Fast Audio-Visual Multi-Person Tracking for a Humanoid Stereo Camera Head

Cited by: 3

Authors:

Nickel, Kai [1]

Stiefelhagen, Rainer [1]

Affiliations:

[1] Univ Karlsruhe, Interact Syst Labs, D-76131 Karlsruhe, Germany

Source:

HUMANOIDS: 2007 7TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS | 2007

Keywords:

DOI:

10.1109/ICHR.2007.4813906

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Subject classification codes:

0808; 0809

Abstract:

In this paper, we present an algorithm for real-time multi-person tracking with a humanoid sensor head featuring a stereo camera and multiple microphones. The proposed algorithm works with a dynamic combination of simple but fast features, which allows us to cope with limited on-board resources. By combining democratic integration and layered sampling, it can deal with deficiencies of single features as well as partial occlusion using the very same dynamic fusion mechanism. Both audio and video signals are processed to form a joint attention map of the surroundings. This map allows us to initialize tracks automatically and to control the robot's focus of attention dynamically.

Pages: 434-441

Page count: 8
