Real-time audio-visual localization of user using microphone array and vision camera

Cited by: 0
Authors
Choi, C [1]
Kong, DG [1]
Lee, S [1]
Park, K [1]
Hong, SG [1]
Lee, HK [1]
Bang, S [1]
Lee, Y [1]
Kim, S [1]
Affiliation
[1] Samsung Adv Inst Technol, Interact Lab, Yongin 449712, Gyeonggi Do, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
In home environments, demand is increasing for robots that serve a user, for example by cleaning rooms or bringing objects to the user. To achieve these tasks, it is essential to develop a natural means of Human-Robot Interaction (HRI). One of the most natural ways is for the robot to approach the user and perform a task after recognizing the user's call and localizing the user's position. In this case, user localization becomes a key technology. In this paper, we propose a novel audio-visual user localization system. It consists of a microphone array with eight sensors and a video camera. The calling direction is estimated by spectral subtraction of the spatial spectra. In particular, a novel beam-forming method is proposed to suppress the non-stationary audio noise that always occurs in the real world. Furthermore, a robust face detection method based on an Adaboost classifier is proposed to double-check the user. It is improved with a new post-processing step on face candidates that remarkably reduces false alarms. Successful results in a real home environment show the system's efficacy and feasibility. The implementation issues, limitations, and their possible solutions are also discussed.
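The idea of estimating a calling direction by subtracting spatial spectra can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' method: it uses a conventional delay-and-sum beamformer in place of the novel beam-forming method named in the abstract, and the uniform circular geometry, the 0.10 m radius, the single 1 kHz analysis frequency, the source angles and amplitudes, and all function names are hypothetical choices made for illustration. Only the number of sensors (eight) comes from the abstract.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)
NUM_MICS = 8            # eight sensors, as stated in the abstract
ARRAY_RADIUS = 0.10     # m, assumed uniform circular array
FREQ_HZ = 1000.0        # assumed single narrowband analysis frequency


def steering_vector(theta_deg):
    """Far-field narrowband steering vector for a source at theta_deg."""
    theta = math.radians(theta_deg)
    k = 2.0 * math.pi * FREQ_HZ / SPEED_OF_SOUND  # wavenumber
    vec = []
    for m in range(NUM_MICS):
        phi = 2.0 * math.pi * m / NUM_MICS  # angular position of mic m
        vec.append(cmath.exp(1j * k * ARRAY_RADIUS * math.cos(theta - phi)))
    return vec


def simulate(sources, num_snapshots=16):
    """Build array snapshots from (angle_deg, amplitude, phase_cycles) tuples.

    phase_cycles is how many full phase rotations the source makes over the
    frame; giving two sources different values makes them uncorrelated.
    """
    frames = []
    for n in range(num_snapshots):
        x = [0j] * NUM_MICS
        for angle, amp, cycles in sources:
            s = amp * cmath.exp(2j * math.pi * cycles * n / num_snapshots)
            a = steering_vector(angle)
            x = [xm + s * am for xm, am in zip(x, a)]
        frames.append(x)
    return frames


def spatial_spectrum(snapshots):
    """Mean delay-and-sum output power for steering angles 0..359 degrees."""
    spectrum = []
    for angle in range(360):
        a = steering_vector(angle)
        power = 0.0
        for x in snapshots:
            y = sum(am.conjugate() * xm for am, xm in zip(a, x)) / NUM_MICS
            power += abs(y) ** 2
        spectrum.append(power / len(snapshots))
    return spectrum


def subtract_spectra(call_spectrum, noise_spectrum):
    """Subtract the noise-only spatial spectrum, flooring the result at zero."""
    return [max(c - n, 0.0) for c, n in zip(call_spectrum, noise_spectrum)]


def peak_angle(spectrum):
    """Return the steering angle (in degrees) with the largest power."""
    return max(range(len(spectrum)), key=lambda i: spectrum[i])


# Hypothetical scene: a loud interferer at 200 degrees, a quieter user at 60.
noise_only = simulate([(200, 1.5, 0)])
call = simulate([(200, 1.5, 0), (60, 1.0, 1)])

noise_spec = spatial_spectrum(noise_only)
call_spec = spatial_spectrum(call)

raw_estimate = peak_angle(call_spec)
estimate = peak_angle(subtract_spectra(call_spec, noise_spec))
print("peak without subtraction:", raw_estimate)
print("peak after subtraction:", estimate)
```

In this simulated scene the peak after subtraction falls at the user's direction of 60 degrees, whereas the unsubtracted spectrum peaks near the louder interferer at 200 degrees. The sketch assumes the interferer is identical in the noise-only and call frames, so it does not address the non-stationary noise that the proposed beam-forming method targets.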
Pages: 497-502
Number of pages: 6