Multi-talker Verbal Interaction for Humanoid Robots

Cited by: 0
Authors
Klin, Bartlomiej [1 ]
Beniak, Ryszard [1 ]
Podpora, Michal [2 ]
Gardecki, Arkadiusz [1 ]
Rut, Joanna [3 ]
Affiliations
[1] Opole Univ Technol, Fac Elect Engn Automat Control & Informat, Opole, Poland
[2] Opole Univ Technol, Dept Comp Sci, Opole, Poland
[3] Opole Univ Technol, Fac Prod Engn & Logist, Opole, Poland
Keywords
Smart beamforming; Human-Computer Interaction; Software-Hardware Integration for Robot Systems; Long-term Interaction; Multi-Modal Perception for HRI; Natural Dialog for HRI; Design and Human Factors;
DOI
10.1109/MMAR62187.2024.10680820
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
Operating in multi-talker mode is viable under certain conditions, such as fusing audio and video stimuli with smart adaptive beamforming of the received audio signals. In this article, the authors verify part of a novel framework under investigation, focusing on how it adapts to dynamic changes in the interlocutor's location within a humanoid robot's engagement zone during multi-talker conversation. The evaluation confirms the need for a complementary, independent method of increasing the accuracy of the interlocutor's signal isolation whenever video analysis performance drops sharply. The authors identify the leading cause as insufficient video processing performance during dynamic conversations: when the interlocutor's speech apparatus moves beyond the expected margin and the video frame rate drops, the video analysis cannot derive a new beamforming configuration.
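The abstract describes steering adaptive beamforming from a video-derived interlocutor position, with an independent fallback when the video pipeline degrades. The following is a minimal illustrative sketch of that general idea, not the authors' framework: a delay-and-sum beamformer on a uniform linear microphone array whose steering angle comes from video when the track is fresh and from an audio-only direction-of-arrival estimate otherwise. All function names, parameters, and thresholds here are hypothetical assumptions for illustration.

# Illustrative sketch (assumed design, not the paper's implementation):
# frequency-domain delay-and-sum beamforming on a uniform linear array,
# steered by a video-derived angle with an audio-only fallback when the
# video track is stale (e.g. the frame rate dropped).
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def steering_delays(angle_rad: float, n_mics: int, spacing_m: float) -> np.ndarray:
    """Per-microphone delays (s) that align a plane wave arriving from angle_rad."""
    mic_positions = np.arange(n_mics) * spacing_m
    return mic_positions * np.sin(angle_rad) / SPEED_OF_SOUND

def delay_and_sum(frames: np.ndarray, delays_s: np.ndarray, fs: int) -> np.ndarray:
    """frames: (n_mics, n_samples) time-domain block; returns the beamformed mono signal."""
    n_mics, n_samples = frames.shape
    spectrum = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # A linear phase shift per channel implements a fractional time delay; average the aligned channels.
    phase = np.exp(-2j * np.pi * freqs[None, :] * delays_s[:, None])
    return np.fft.irfft((spectrum * phase).mean(axis=0), n=n_samples)

def choose_steering_angle(video_angle_rad, video_age_s, audio_angle_rad,
                          max_video_age_s: float = 0.2) -> float:
    """Prefer the video-derived direction; fall back to the audio-only DOA estimate
    when the video track is missing or older than max_video_age_s."""
    if video_angle_rad is not None and video_age_s <= max_video_age_s:
        return video_angle_rad
    return audio_angle_rad

if __name__ == "__main__":
    fs, n_mics, spacing = 16000, 4, 0.05
    block = np.random.default_rng(0).standard_normal((n_mics, 1024))  # stand-in for a mic-array block
    angle = choose_steering_angle(video_angle_rad=np.deg2rad(20.0),
                                  video_age_s=0.05,
                                  audio_angle_rad=np.deg2rad(15.0))
    out = delay_and_sum(block, steering_delays(angle, n_mics, spacing), fs)
    print(out.shape)  # (1024,)

The 0.2 s staleness threshold stands in for whatever margin the framework uses to decide that the video-derived configuration can no longer be trusted; in practice that decision would also consider detection confidence and the expected motion margin mentioned in the abstract.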
Pages: 521-526
Page count: 6