AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

Cited by: 7
Authors
Marras, Mirko [1 ]
Marin-Reyes, Pedro A. [2 ]
Lorenzo-Navarro, Javier [2 ]
Castrillon-Santana, Modesto [2 ]
Fenu, Gianni [1 ]
Affiliations
[1] Univ Cagliari, Dept Math & Comp Sci, V Osped 72, I-09124 Cagliari, Italy
[2] Univ Las Palmas Gran Canaria, Inst Univ Sistemas Inteligentes & Aplicac Numer I, Campus Univ Tafira, Las Palmas Gran Canaria 35017, Spain
Keywords
Face-voice Dataset; Deep Learning; People Verification; People Re-Identification; Human-Robot Interaction; PERSON REIDENTIFICATION;
DOI
10.5220/0007690902550265
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves using robots to assist people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often rely on facial information to establish the identity of a person of interest. The face alone, however, may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities, such as the voice, can improve recognition performance under these conditions. However, the existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in the number of identities or samples per user, collected with the same recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot assistance scenarios. The collection took place in a three-floor building through eight different cameras with built-in microphones. Face and voice re-identification and verification performance was evaluated on this dataset with deep learning baselines and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.
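To illustrate the verification task mentioned in the abstract, the following is a minimal sketch of how trial-level verification performance could be scored from learned embeddings, assuming cosine similarity between L2-normalized embedding pairs and the equal error rate (EER) as the summary metric. The function names, array names and toy data are hypothetical illustrations and are not taken from the paper's evaluation pipeline.

import numpy as np

def cosine_scores(emb_a, emb_b):
    # One cosine-similarity score per trial pair (embeddings as rows).
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def equal_error_rate(scores, labels):
    # EER: operating point where false-accept and false-reject rates are closest.
    genuine = scores[labels == 1]
    impostor = scores[labels == 0]
    eer, best_gap = 0.5, float("inf")
    for t in np.sort(scores):
        far = np.mean(impostor >= t)   # impostor pairs accepted at threshold t
        frr = np.mean(genuine < t)     # genuine pairs rejected at threshold t
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

# Toy usage with hypothetical 2-D embeddings: two genuine and two impostor trials.
a = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
b = np.array([[1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [1.0, 0.0]])
labels = np.array([1, 0, 1, 0])
print(equal_error_rate(cosine_scores(a, b), labels))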
Pages: 255 - 265
Page count: 11