Luka Kraljevic, Mladen Russo, Maja Stella



Voice Command Module for Smart Home Automation



Voice control is the most prominent feature of the smart home environment. In this paper, we propose a voice command module that enables hands-free user interaction with the smart home environment. We present three components required for simple and efficient control of smart home devices. The wake-up word component enables the actual voice command processing. The speech recognition component maps spoken voice commands to text, and the Voice Control Interface parses that text into an appropriate JSON format for home automation. We evaluate the feasibility of using the voice command module in a smart home environment by analyzing each component of the module separately.
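As an illustration of the last step, the mapping from recognized text to a JSON command can be sketched as follows. This is a minimal sketch under stated assumptions, not the parser used in the paper: the command vocabulary, the JSON field names (`device`, `state`, `room`) and the function names `parse_command` and `to_json` are all hypothetical, and a regular expression stands in for whatever grammar the Voice Control Interface actually uses.

```python
import json
import re

# Hypothetical vocabulary; none of these phrases or fields come from the paper.
ACTIONS = {
    "turn on": "on",
    "turn off": "off",
    "switch on": "on",
    "switch off": "off",
}

# Assumed command shape: "<action> [the] <device>[s] [in [the] <room>]".
COMMAND_RE = re.compile(
    r"^(?P<action>turn on|turn off|switch on|switch off)\s+"
    r"(?:the\s+)?(?P<device>light|fan|heater|tv)s?"
    r"(?:\s+in\s+(?:the\s+)?(?P<room>[a-z ]+))?$"
)


def parse_command(text):
    """Return a dict describing the command, or None if it is not understood."""
    # Lowercase and collapse whitespace so the pattern sees a canonical form.
    normalized = " ".join(text.lower().split())
    match = COMMAND_RE.match(normalized)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "state": ACTIONS[match.group("action")],
        "room": match.group("room"),  # None when no room was spoken
    }


def to_json(text):
    """Serialize a recognized command to JSON, or return None if unparsed."""
    parsed = parse_command(text)
    return None if parsed is None else json.dumps(parsed, sort_keys=True)


if __name__ == "__main__":
    print(to_json("Turn on the lights in the kitchen"))
    print(to_json("open the door"))
```

Returning `None` for unrecognized text, rather than raising, lets a caller ignore speech that is not a command; a real system would likely use a richer grammar and a schema agreed with the home automation backend.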


Smart Home, Speech Recognition, Voice Control, Wake-up Word, Command Parsing


Cite this paper

Luka Kraljevic, Mladen Russo, Maja Stella. (2018) Voice Command Module for Smart Home Automation. International Journal of Signal Processing, 3, 33-37


Copyright © 2018 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.