AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

Cited by: 7
Authors
Marras, Mirko [1 ]
Marin-Reyes, Pedro A. [2 ]
Lorenzo-Navarro, Javier [2 ]
Castrillon-Santana, Modesto [2 ]
Fenu, Gianni [1 ]
Affiliations
[1] Univ Cagliari, Dept Math & Comp Sci, V Osped 72, I-09124 Cagliari, Italy
[2] Univ Las Palmas Gran Canaria, Inst Univ Sistemas Inteligentes & Aplicac Numer I, Campus Univ Tafira, Las Palmas Gran Canaria 35017, Spain
Keywords
Face-voice Dataset; Deep Learning; People Verification; People Re-Identification; Human-Robot Interaction; PERSON REIDENTIFICATION;
DOI
10.5220/0007690902550265
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves using robots to assist people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often rely on facial information to establish the identity of a person of interest. The face alone, however, may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities, such as the voice, can improve recognition performance under these conditions. However, the existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in the number of identities or samples per user, collected with the same recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot assistance scenarios. The collection took place in a three-floor building through eight different cameras with built-in microphones. Face and voice re-identification and verification performance was evaluated on this dataset with deep learning baselines and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.
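To illustrate the verification task mentioned in the abstract, the following is a minimal sketch of how trial-level verification performance could be scored from learned embeddings, assuming cosine similarity between L2-normalized embedding pairs and the equal error rate (EER) as the summary metric. The function names, array names and toy data are hypothetical illustrations and are not taken from the paper's evaluation pipeline.

import numpy as np

def cosine_scores(emb_a, emb_b):
    # One cosine-similarity score per trial pair (embeddings as rows).
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def equal_error_rate(scores, labels):
    # EER: operating point where false-accept and false-reject rates are closest.
    genuine = scores[labels == 1]
    impostor = scores[labels == 0]
    eer, best_gap = 0.5, float("inf")
    for t in np.sort(scores):
        far = np.mean(impostor >= t)   # impostor pairs accepted at threshold t
        frr = np.mean(genuine < t)     # genuine pairs rejected at threshold t
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

# Toy usage with hypothetical 2-D embeddings: two genuine and two impostor trials.
a = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
b = np.array([[1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [1.0, 0.0]])
labels = np.array([1, 0, 1, 0])
print(equal_error_rate(cosine_scores(a, b), labels))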
Pages: 255 - 265
Page count: 11