A cloud-based middleware for multi-modal interaction services and applications

Avenoğlu, Bilgin; Koeman, Vincent J.; Hindriks, Koen V.

doi:10.3233/AIS-220161

Article type: Research Article

Authors: Avenoğlu, Bilgin^{a; *} | Koeman, Vincent J.^b | Hindriks, Koen V.^b

Affiliations: [a] Faculty of Engineering, Ankara University, Ankara, Turkey | [b] Faculty of Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands

Correspondence: [*] Corresponding author. E-mail: [email protected].

Abstract: Smart devices, such as smartphones, voice assistants, and social robots, provide users with a range of input modalities, e.g., speech, touch, gestures, and vision. In recent years, advances in the processing of these input channels (e.g., automated speech, face, and gesture recognition, dialog generation, and emotion expression) have enabled more natural interaction experiences for users. However, several important challenges need to be addressed to create these user experiences. One challenge is that most smart devices do not have sufficient computing resources to execute Artificial Intelligence (AI) techniques locally. Another is that users expect responses in near real time when they interact with these devices. Moreover, users want to switch seamlessly between devices and services at any time and from anywhere, and they expect personalized and privacy-aware services. To address these challenges, we design and develop a cloud-based middleware (CMI) that helps developers build multi-modal interaction applications and integrate them easily with AI services. The middleware makes it easy to integrate services developed by different producers using different protocols, as well as smart devices with different capabilities and protocols. In CMI, applications stream data from devices to cloud services for processing and consume the results. CMI supports data streaming from multiple devices to multiple services (and vice versa). It provides an integration framework that decouples services from devices and lets application developers concentrate on “interaction” instead of AI techniques. We provide simple examples to illustrate the conceptual ideas incorporated in CMI.
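
The decoupling idea in the abstract can be illustrated with a small publish/subscribe sketch. This is not CMI's actual API, which the abstract does not describe; the `Broker` class, the topic names, the device identifiers, and the toy "speech service" are all hypothetical and serve only to show how several devices can stream data to several services without knowing about each other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the id of the originating device and the raw payload.
Handler = Callable[[str, bytes], None]


class Broker:
    """Hypothetical in-memory stand-in for a streaming middleware."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Any number of services or applications may listen on one topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, device_id: str, payload: bytes) -> None:
        # Deliver the payload to every subscriber of the topic, in order.
        for handler in list(self._subscribers[topic]):
            handler(device_id, payload)


def run_demo() -> List[str]:
    broker = Broker()
    log: List[str] = []

    # Hypothetical service: consumes "audio" and publishes a "transcript".
    # Upper-casing the bytes is a placeholder for real speech recognition.
    def speech_service(device_id: str, payload: bytes) -> None:
        text = payload.decode("utf-8").upper()
        broker.publish("transcript", device_id, text.encode("utf-8"))

    # A second, independent service that listens on the same stream.
    def audio_logger(device_id: str, payload: bytes) -> None:
        log.append(f"archived {len(payload)} bytes from {device_id}")

    # The application only consumes results; it never calls a service directly.
    def application(device_id: str, payload: bytes) -> None:
        log.append(f"{device_id} said {payload.decode('utf-8')}")

    broker.subscribe("audio", speech_service)
    broker.subscribe("audio", audio_logger)
    broker.subscribe("transcript", application)

    # Two different devices stream to the same topic.
    broker.publish("audio", "robot-1", b"hello")
    broker.publish("audio", "phone-7", b"hi")
    return log


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

In this sketch, adding a new device or service only requires another `publish` or `subscribe` call on a topic; no existing component has to change, which is the kind of decoupling the abstract attributes to an integration framework.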

Keywords: Software architectures for AI, multi-modal interaction, smart devices, cloud computing, integration framework

Journal: Journal of Ambient Intelligence and Smart Environments, vol. 14, no. 6, pp. 455-481, 2022

Received 22 April 2022

Accepted 25 October 2022

Published: 29 November 2022
