Izidor Mlakar, Zdravko Kacic, Matej Borko, Matej Rojc



A Novel Unity-based Realizer for the Realization of Conversational Behavior on Embodied Conversational Agents



Embodied conversational agents are virtual entities that aim to imitate as many features of face-to-face dialog as possible. To achieve this goal, the ability to reproduce synchronized verbal and co-verbal signals, coupled into conversational behavior, becomes essential. Further, signals such as social cues, attitude (emotions), personality, eye blinks, and spontaneous head movement are equally important. Modern 3D environments and 3D modeling tools, such as Maya, Daz3D, Blender, Panda3D, and Unity, have opened up completely new possibilities for designing virtual entities that appear almost (if not completely) like real-life persons. However, modern 3D technology is not designed to handle highly dynamic and interchangeable contexts such as human interaction. Therefore, animations are mostly prepared in advance, support limited diversity, and have limited capacity to adapt to a new set of parameters. This paper presents the EVA realizer engine, which is part of the proprietary behavior realization components of the EVA Framework. The presented engine is based on the Unity game engine. The EVA realizer exploits the benefits of modern game engines and extends them with the requirements of co-verbal realizers by providing an interpreter and manager for dynamic animation generated in real time. The animation is created and modeled by a proprietary external co-verbal behavior generator component.


embodied conversational agents, co-verbal realizers, animation, virtual reality, mixed reality, multimodal interaction



Cite this paper

Izidor Mlakar, Zdravko Kacic, Matej Borko, Matej Rojc. (2017) A Novel Unity-based Realizer for the Realization of Conversational Behavior on Embodied Conversational Agents. International Journal of Computers, 2, 205-213


Copyright © 2017 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0