On July 17, 2014, in Karlsruhe, Germany, Joris IJsselmuiden successfully defended his PhD thesis entitled “Interaction analysis in smart work environments through fuzzy temporal logic” [1]. The examination committee consisted of Rainer Stiefelhagen, Jürgen Beyerer, Michael Beigl, Dorothea Wagner, Oliver Hummel, and Peter H. Schmitt (Fig. 1). The main publications associated with this PhD thesis are [2–9].
Interaction analysis, or the autonomous generation of situation descriptions, can be achieved through a combination of machine perception and reasoning. This PhD thesis presents a reasoning system for interaction analysis based on fuzzy metric temporal logic (FMTL) and situation graph trees (SGTs). As the presented reasoning methods can use any combination of machine perception components (e.g. human pose estimation, speech recognition, and vehicle tracking), the range of possible application domains is wide: multimedia retrieval, robotics, ambient assisted living, intelligent user interfaces, surveillance, and more. This study focuses on staff exercises in crisis response control rooms (Fig. 2). Trainees take on the roles of control room staff while others simulate field units, crisis dynamics, distress calls, and radio communications. Storylines for such exercises are concerned with events such as large traffic accidents, widespread fires, or floods.
In today’s staff exercises, individual, task-oriented feedback is hard to provide. Automatically generated behavior reports can improve this situation. They could be used to assess the performance of the individual participants: How closely did they follow standard operating procedures? Who should have been part of which group? How long did it take them to complete specific tasks? To enable automatic report generation, not only the simulated crisis dynamics, field units, etc. need to be modeled, but also the situation within the control room, which was the focus of this study. To this end, the developed system aims to recognize group behavior by modeling and recognizing the different types of person-person and person-object interaction in various group formations.
The approach can be separated into three parts: perception, reasoning, and evaluation. Part one, perception, was performed using annotated input data. The staff members and objects in the control room (notepads, displays, tables, etc.) were recorded using five cameras and four microphones. Based on these real audiovisual data, an annotator used a self-developed data annotation tool to create hypothetical machine perception outputs, that is, symbolic descriptions mimicking what perception components would produce.
Part two, reasoning, was performed using fuzzy metric temporal logic (FMTL) and situation graph trees (SGTs). Situation descriptions were generated describing common group interactions in staff exercises. Optionally, FMTL rule parameters could be optimized through maximization of an adapted F-score. Another option was to use an adapted clustering algorithm as a preprocessing step. Clustering enriches the person descriptions in the annotated input data with cluster membership information based on their positions in the room. This simplifies the subsequent reasoning process and allows for a more intuitive approach, yielding more consistent and less redundant results. Clustering can also lead to better runtimes.
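To illustrate the kind of reasoning involved, the following minimal Python sketch shows how a fuzzy spatial predicate, a fuzzy conjunction, and a metric temporal operator could be evaluated over per-frame data. It is not the FMTL/SGT implementation used in the thesis: the function names, the distance thresholds, and the choice of the minimum as conjunction are assumptions made for illustration only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def close_to(p: Point, q: Point, near: float = 1.0, far: float = 3.0) -> float:
    """Hypothetical fuzzy predicate: truth value in [0, 1] for 'p is close to q'.

    Returns 1.0 up to `near` metres, 0.0 from `far` metres onward,
    and falls linearly in between. Both thresholds are assumed values.
    """
    d = math.dist(p, q)
    if d <= near:
        return 1.0
    if d >= far:
        return 0.0
    return (far - d) / (far - near)


def fuzzy_and(*values: float) -> float:
    """Fuzzy conjunction, here taken as the minimum (an assumption)."""
    return min(values)


def always_within(truth_per_frame: Sequence[float], t: int, window: int) -> float:
    """Degree to which a predicate held during the last `window` frames up to t.

    Sketches a metric temporal 'always' operator as the minimum truth value
    over the frames t - window + 1 .. t (clipped at frame 0).
    """
    start = max(0, t - window + 1)
    return min(truth_per_frame[start:t + 1])
```

For example, a rule such as "two persons are talking at a table" could combine `close_to` values for each person and the table with `fuzzy_and`, and then require via `always_within` that the result holds over several consecutive frames.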
Part three, evaluation, was performed by quantitatively comparing reasoning results to ground truth that was created using a self-developed ground-truth annotation tool. The main performance measures were precision, recall, and F-score, plotted over different truth value thresholds. The evaluation also contained a runtime analysis, an error analysis, experiments on noisy data, measurements of inter-annotator agreement, and analyses of the effect of a new clustering parameter and of parameter learning.
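The threshold-based evaluation can be sketched as follows. This is a minimal illustration that uses the standard precision, recall, and F1 definitions rather than the adapted F-score of the thesis; the data layout (a dictionary from hypothetical `(frame, label)` keys to truth values) and all names are assumptions.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def precision_recall_f(
    results: Dict[Hashable, float],
    ground_truth: Set[Hashable],
    threshold: float,
) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 at one truth value threshold.

    `results` maps each generated situation description to its fuzzy truth
    value; a description counts as predicted if its value is >= threshold.
    """
    predicted = {key for key, value in results.items() if value >= threshold}
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_score


def sweep(
    results: Dict[Hashable, float],
    ground_truth: Set[Hashable],
    thresholds: Iterable[float],
) -> List[Tuple[float, float, float, float]]:
    """Evaluate every threshold; each row is (threshold, precision, recall, F1)."""
    return [(t, *precision_recall_f(results, ground_truth, t)) for t in thresholds]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_results = {(0, "discussion"): 0.9, (1, "discussion"): 0.4, (2, "briefing"): 0.7}
    toy_truth = {(0, "discussion"), (2, "briefing"), (3, "briefing")}
    for row in sweep(toy_results, toy_truth, [0.3, 0.5, 0.95]):
        print(row)
```

Plotting the rows returned by `sweep` over the thresholds gives curves of the kind described above; the threshold that maximizes the F-score could also serve as the objective when tuning rule parameters.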
The contributions of this PhD thesis come from its unique application domain and from the reasoning models developed for it. Furthermore, methodological changes were made to add clustering and parameter learning to the FMTL/SGT framework. These contributions are supported by the development of an enabling software toolkit and a thorough evaluation. FMTL and SGTs were applied to this type of interaction analysis for the first time.
Supported by Fraunhofer-Gesellschaft Internal Programs Grant 692 026.
[1] J. IJsselmuiden, Interaction analysis in smart work environments through fuzzy temporal logic, PhD thesis, Karlsruhe Institute of Tech., Dep. Informatics, 2014.
[2] J. IJsselmuiden, D. Münch, A.-K. Grosselfinger, M. Arens and R. Stiefelhagen, Automatic understanding of group behavior using fuzzy temporal logic, Journal of Ambient Intelligence and Smart Environments (JAISE) 6(6) (2014), 623–649.
[3] J. Dornheim and J. IJsselmuiden (advisor), Methodische Erweiterung einer Prozesskette zur Interaktionsanalyse mittels unscharfer temporaler Logik, Master’s thesis, Karlsruhe University of Applied Sciences, Faculty Computer Science and Business Inf. Sys., 2014.
[4] J. IJsselmuiden, A.-K. Grosselfinger, D. Münch, M. Arens and R. Stiefelhagen, Automatic behavior understanding in crisis response control rooms, in: Conference on Ambient Intelligence (AmI), (2012), pp. 97–112.
[5] J. IJsselmuiden and R. Stiefelhagen, Towards high-level human activity recognition through computer vision and temporal logic, in: German Conference on Artificial Intelligence (KI), (2010), pp. 426–435.
[6] D. Münch, J. IJsselmuiden, A.-K. Grosselfinger, M. Arens and R. Stiefelhagen, Rule-based high-level situation recognition from incomplete tracking data, in: Symposium on Rules (RuleML), (2012).
[7] A. Schick, F. van de Camp, J. IJsselmuiden and R. Stiefelhagen, Extending touch: Towards interaction with large-scale surfaces, in: ACM Interactive Tabletops and Surfaces Conference (ITS), (2009), pp. 117–124.
[8] D. Reich, F. Putze, D. Heger, J. IJsselmuiden, R. Stiefelhagen and T. Schultz, A real-time speech command detector for a smart control room, in: Conf. o. t. Int. Speech Comm. Assoc. (INTERSPEECH), (2011), pp. 2641–2644.
[9] M. Voit, F. van de Camp, J. IJsselmuiden, A. Schick and R. Stiefelhagen, Visuelle Perzeption für die multimodale Mensch-Maschine-Interaktion in und mit aufmerksamen Räumen, at Automatisierungst. (2013), 784–791.