Automating Gaze Target Annotation in Human-Robot Interaction

Cited: 0
Authors: Cheng, Linlin [1]; Hindriks, Koen V. [1]; Belopolsky, Artem V. [2]
Affiliations:
[1] Vrije Univ Amsterdam, Fac Sci, Comp Sci, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Human Movement Sci, Amsterdam, Netherlands
DOI: 10.1109/RO-MAN60168.2024.10731455
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Identifying gaze targets in videos of human-robot interaction is useful for measuring engagement. In practice, this requires manually annotating which object from a fixed set a participant is looking at in each video, which is very time-consuming. To address this issue, we propose an annotation pipeline that automates this effort. In this work, we focus on videos in which the objects being looked at do not move. As input to the proposed pipeline, we therefore only need to annotate object bounding boxes for the first frame of each video. A further benefit of manually annotating these first frames is that we can also draw bounding boxes for objects outside the frame, which enables estimating gaze targets in videos where not all objects are visible. A second issue we address is that the models used to automate the pipeline annotate individual video frames, whereas in practice manual annotation is done at the event level for video segments rather than for single frames. We therefore also introduce and investigate several variants of algorithms for aggregating frame-level annotations into event-level annotations, which form the last step of our pipeline. We compare two versions of our pipeline: one that uses a state-of-the-art gaze estimation model (GEM) and one that uses a state-of-the-art target detection model (TDM). Our results show that both versions successfully automate the annotation, but the GEM pipeline performs slightly (approximately 10%) better on videos where not all objects are visible. Analysis of our aggregation algorithm furthermore shows that manual video segmentation is unnecessary, because a fixed time interval for segmentation yields very similar results. We conclude that the proposed pipeline can be used to automate almost all of the annotation effort.
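The abstract does not give implementation details, so the following are minimal illustrative sketches rather than the authors' code. The first sketch assumes the gaze estimation model yields a 2D gaze point per frame and assigns the frame-level target as the first-frame bounding box that contains (or is nearest to) that point; because boxes may be drawn outside the image for off-screen objects, a nearest-box fallback is used. All names (Box, frame_target, max_dist) are hypothetical.

```python
# Illustrative sketch (not the authors' code): assign a per-frame gaze target
# from first-frame object bounding boxes and an estimated 2D gaze point.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Box:
    label: str
    x1: float
    y1: float
    x2: float
    y2: float  # boxes may lie outside the image for off-screen objects

def distance_to_box(px: float, py: float, b: Box) -> float:
    """Euclidean distance from a point to the closest point on a box (0 if inside)."""
    dx = max(b.x1 - px, 0.0, px - b.x2)
    dy = max(b.y1 - py, 0.0, py - b.y2)
    return (dx * dx + dy * dy) ** 0.5

def frame_target(gaze_point: Optional[Tuple[float, float]], boxes: List[Box],
                 max_dist: float = 50.0) -> Optional[str]:
    """Pick the box containing the gaze point, or the nearest box within max_dist pixels."""
    if gaze_point is None or not boxes:
        return None  # no gaze estimate or no annotated objects for this frame
    px, py = gaze_point
    best = min(boxes, key=lambda b: distance_to_box(px, py, b))
    return best.label if distance_to_box(px, py, best) <= max_dist else None
```

The second sketch shows one plausible aggregation variant consistent with the abstract's finding that fixed-interval segmentation suffices: frame-level labels are grouped into fixed-length segments and each segment receives the majority label among its frames. The exact aggregation variants in the paper may differ; fps and interval_s are assumed parameters.

```python
# Illustrative sketch of one aggregation variant (an assumption, not necessarily the
# paper's exact algorithm): fixed-interval segmentation with a per-segment majority vote.
from collections import Counter
from typing import List, Optional, Tuple

def aggregate_fixed_interval(frame_labels: List[Optional[str]],
                             fps: float = 30.0,
                             interval_s: float = 1.0) -> List[Tuple[float, float, Optional[str]]]:
    """Return (start_time, end_time, label) per segment, where label is the majority vote."""
    seg_len = max(1, int(round(fps * interval_s)))
    events = []
    for start in range(0, len(frame_labels), seg_len):
        segment = frame_labels[start:start + seg_len]
        counts = Counter(label for label in segment if label is not None)
        label = counts.most_common(1)[0][0] if counts else None
        events.append((start / fps, (start + len(segment)) / fps, label))
    return events
```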
Pages: 991-998
Page count: 8