AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

Cited by: 7

Authors:

Marras, Mirko [1]

Marin-Reyes, Pedro A. [2]

Lorenzo-Navarro, Javier [2]

Castrillon-Santana, Modesto [2]

Fenu, Gianni [1]

Affiliations:

[1] Univ Cagliari, Dept Math & Comp Sci, V Osped 72, I-09124 Cagliari, Italy

[2] Univ Las Palmas Gran Canaria, Inst Univ Sistemas Inteligentes & Aplicac Numer I, Campus Univ Tafira, Las Palmas Gran Canaria 35017, Spain

Source:

ICPRAM: PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS | 2019

Keywords:

Face-voice Dataset; Deep Learning; People Verification; People Re-Identification; Human-Robot Interaction; PERSON REIDENTIFICATION;

DOI:

10.5220/0007690902550265

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves the use of robots to assist people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often use facial information to establish the identity of a person of interest. But the face alone may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities, such as the voice, can improve recognition performance in these conditions. However, existing datasets for robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in the number of identities or samples per user, collected with a single recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences in robot-assistance scenarios. The collection took place in a three-floor building using eight different cameras with built-in microphones. The performance of face and voice re-identification and verification was evaluated on this dataset with deep learning baselines and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.


Pages: 255-265

Page count: 11
