Real-time audio-visual localization of user using microphone array and vision camera

Cited by: 0
Authors
Choi, C [1]
Kong, DG [1]
Lee, S [1]
Park, K [1]
Hong, SG [1]
Lee, HK [1]
Bang, S [1]
Lee, Y [1]
Kim, S [1]
Affiliations
[1] Samsung Adv Inst Technol, Interact Lab, Yongin 449712, Gyeonggi Do, South Korea
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In home environments, demand is increasing for robots that serve a user by, for example, cleaning rooms or fetching objects. To accomplish these tasks, it is essential to develop a natural means of Human-Robot Interaction (HRI). One of the most natural approaches is for the robot to recognize the user's call, localize the user's position, and then approach the user to carry out the task. In this case, user localization becomes a key technology. In this paper, we propose a novel audio-visual user localization system. It consists of a microphone array with eight sensors and a video camera. The calling direction is estimated by spectral subtraction of spatial spectra. In particular, a novel beam-forming method is proposed to suppress the non-stationary audio noise that always occurs in the real world. Furthermore, a robust face detection method based on an AdaBoost classifier is proposed to double-check the user. It is improved by a new post-processing step on face candidates that remarkably reduces false alarms. Successful results in a real home environment show the system's efficacy and feasibility. Implementation issues, limitations, and possible solutions are also discussed.
Pages: 497-502
Page count: 6