AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

Cited by: 7
Authors
Marras, Mirko [1 ]
Marin-Reyes, Pedro A. [2 ]
Lorenzo-Navarro, Javier [2 ]
Castrillon-Santana, Modesto [2 ]
Fenu, Gianni [1 ]
Affiliations
[1] Univ Cagliari, Dept Math & Comp Sci, V Osped 72, I-09124 Cagliari, Italy
[2] Univ Las Palmas Gran Canaria, Inst Univ Sistemas Inteligentes & Aplicac Numer I, Campus Univ Tafira, Las Palmas Gran Canaria 35017, Spain
Keywords
Face-voice Dataset; Deep Learning; People Verification; People Re-Identification; Human-Robot Interaction; PERSON REIDENTIFICATION;
DOI
10.5220/0007690902550265
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves the use of robots for assisting people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often use facial information to establish the identity of a person of interest. However, the face alone may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities, such as the voice, can improve recognition performance under these conditions. However, the existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in the number of identities or samples per user, collected by the same recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot assistance scenarios. The collection took place in a three-floor building through eight different cameras with built-in microphones. The performance for face and voice re-identification and verification was evaluated on this dataset with deep learning baselines, and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.
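The abstract mentions evaluating face and voice verification with deep learning baselines. As a hedged illustration (not taken from the paper), the sketch below shows how such verification baselines are commonly scored: embeddings for a trial pair are compared with cosine similarity, and the Equal Error Rate (EER) is read off where false-accept and false-reject rates cross. The random embeddings here are stand-ins for the face/voice descriptors a network would produce.

```python
# Hedged sketch: standard verification scoring (cosine similarity + EER).
# The embeddings are synthetic stand-ins, not outputs of the paper's models.
import numpy as np

def cosine_scores(a, b):
    """Cosine similarity between corresponding rows of two embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def equal_error_rate(scores, labels):
    """EER: operating point where false-reject rate equals false-accept rate."""
    order = np.argsort(scores)[::-1]          # accept highest-scoring pairs first
    labels = np.asarray(labels, dtype=float)[order]
    n_pos, n_neg = labels.sum(), len(labels) - labels.sum()
    tpr = np.cumsum(labels) / n_pos           # true-positive rate per threshold
    fpr = np.cumsum(1 - labels) / n_neg       # false-positive rate per threshold
    fnr = 1 - tpr
    idx = np.argmin(np.abs(fnr - fpr))        # closest crossing point
    return (fnr[idx] + fpr[idx]) / 2

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 64))
# Genuine pairs: noisy copies of the same embedding; impostor pairs: unrelated.
genuine = cosine_scores(base, base + 0.5 * rng.normal(size=(100, 64)))
impostor = cosine_scores(base, rng.normal(size=(100, 64)))
scores = np.concatenate([genuine, impostor])
labels = np.concatenate([np.ones(100), np.zeros(100)])
print(f"EER: {equal_error_rate(scores, labels):.2%}")
```

A lower EER means the modality (or fused modalities) separates genuine from impostor pairs better; the paper's point is that AveRobot's uncontrolled recording conditions make this separation harder than in datasets captured under lab conditions.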
Pages: 255-265
Page count: 11
Related Papers
50 total
  • [1] RoboReID: Audio-Visual Person Re-Identification by Social Robot
    Lu, Zhijing
    Ashok, Ashita
    Berns, Karsten
    2024 10TH IEEE RAS/EMBS INTERNATIONAL CONFERENCE FOR BIOMEDICAL ROBOTICS AND BIOMECHATRONICS, BIOROB 2024, 2024, : 1758 - 1763
  • [2] Human-robot interaction in real environments by audio-visual integration
    Kim, Hyun-Don
    Choi, Jong-Suk
    Kim, Munsang
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2007, 5 (01) : 61 - 69
  • [3] Audio-Visual Speech Recognition for Human-Robot Interaction: a Feasibility Study
    Goetzee, Sander
    Mihhailov, Konstantin
    van de Laar, Roel
    Baraka, Kim
    Hindriks, Koen V.
    2024 33RD IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, ROMAN 2024, 2024, : 930 - 935
  • [4] Deep Multi-biometric Fusion for Audio-Visual User Re-Identification and Verification
    Marras, Mirko
    Marin-Reyes, Pedro A.
    Lorenzo-Navarro, Javier
    Castrillon-Santana, Modesto
    Fenu, Gianni
    PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM 2019), 2020, 11996 : 136 - 157
  • [5] Audio-Visual Integration For Human-Robot Interaction in Multi-person Scenarios
    Quang Nguyen
    Yun, Sang-Seok
    Choi, JongSuk
    2014 IEEE EMERGING TECHNOLOGY AND FACTORY AUTOMATION (ETFA), 2014,
  • [6] Audio-Visual SLAM towards Human Tracking and Human-Robot Interaction in Indoor Environments
    Chau, Aaron
    Sekiguchi, Kouhei
    Nugraha, Aditya Arie
    Yoshii, Kazuyoshi
    Funakoshi, Kotaro
    2019 28TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2019,
  • [7] AUDIO-VISUAL OBJECT CLASSIFICATION FOR HUMAN-ROBOT COLLABORATION
    Xompero, A.
    Pang, Y. L.
    Patten, T.
    Prabhakar, A.
    Calli, B.
    Cavallaro, A.
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9137 - 9141
  • [8] Integration of Tracking, Re-Identification, and Gesture Recognition for Facilitating Human-Robot Interaction
    Lee, Sukhan
    Lee, Soojin
    Park, Hyunwoo
    SENSORS, 2024, 24 (15)
  • [9] Neural network based reinforcement learning for audio-visual gaze control in human-robot interaction
    Lathuiliere, Stephane
    Masse, Benoit
    Mesejo, Pablo
    Horaud, Radu
     PATTERN RECOGNITION LETTERS, 2019, 118 : 61 - 71
  • [10] Collaborative analysis of audio-visual speech synthesis with sensor measurements for regulating human-robot interaction
    Ashok, K.
    Ashraf, Mohd
    Raja, J. Thimmia
    Hussain, Md Zair
    Singh, Dinesh Kumar
    Haldorai, Anandakumar
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2022,