Real-time audio-visual localization of user using microphone array and vision camera

Authors
Choi, C [1]
Kong, DG [1]
Lee, S [1]
Park, K [1]
Hong, SG [1]
Lee, HK [1]
Bang, S [1]
Lee, Y [1]
Kim, S [1]
Affiliation
[1] Samsung Adv Inst Technol, Interact Lab, Yongin 449712, Gyeonggi Do, South Korea
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In home environments, demand is growing for robots that serve a user by cleaning rooms, bringing objects to the user, and so on. To accomplish these tasks, it is essential to develop a natural means of Human-Robot Interaction (HRI). One of the most natural is for the robot to approach the user and carry out a task after recognizing the user's call and localizing the user's position. In this case, user localization becomes a key technology. In this paper, we propose a novel audio-visual user localization system. It consists of a microphone array with eight sensors and a video camera. The calling direction is estimated by spectral subtraction of the spatial spectra. In particular, a novel beamforming method is proposed to suppress the non-stationary audio noise that always occurs in the real world. Furthermore, a robust face detection method based on an AdaBoost classifier is proposed to double-check the user. It is improved by a new post-processing step on face candidates that markedly reduces false alarms. Successful results in a real home environment show the system's efficacy and feasibility. Implementation issues, limitations, and possible solutions are also discussed.
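The abstract's idea of estimating the calling direction by subtracting spatial spectra can be illustrated with a minimal sketch. The record gives no details of the paper's novel beamforming method, so plain narrowband delay-and-sum beamforming is used here as a stand-in. Only the eight-microphone count comes from the abstract; the circular geometry, the 0.1 m radius, the 1500 Hz analysis bin, the synthetic signals, and all function names are assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s


def circular_array(n_mics=8, radius=0.1):
    """Mic positions (n_mics, 2) on a circle; the geometry is an assumption."""
    phi = 2 * np.pi * np.arange(n_mics) / n_mics
    return radius * np.column_stack([np.cos(phi), np.sin(phi)])


def steering_vector(mics, azimuth, freq):
    """Far-field narrowband steering vector for a plane wave from `azimuth` (rad)."""
    direction = np.array([np.cos(azimuth), np.sin(azimuth)])
    delays = -(mics @ direction) / SPEED_OF_SOUND  # seconds, relative to array centre
    return np.exp(-2j * np.pi * freq * delays)


def spatial_spectrum(frames, mics, freq, grid):
    """Delay-and-sum output power for each candidate azimuth in `grid`.

    frames: complex array (n_mics, n_frames), one frequency bin per microphone.
    """
    cov = frames @ frames.conj().T / frames.shape[1]  # spatial covariance
    steer = np.stack([steering_vector(mics, az, freq) for az in grid])
    power = np.einsum("gm,mn,gn->g", steer.conj(), cov, steer).real
    return power / mics.shape[0] ** 2


def estimate_call_direction(call_frames, noise_frames, mics, freq, grid):
    """Subtract the noise-only spatial spectrum, then pick the strongest azimuth."""
    diff = (spatial_spectrum(call_frames, mics, freq, grid)
            - spatial_spectrum(noise_frames, mics, freq, grid))
    diff = np.maximum(diff, 0.0)  # clip negatives, as in ordinary spectral subtraction
    return grid[int(np.argmax(diff))]


def simulate(mics, freq, sources, n_frames, rng):
    """Synthetic frames; `sources` is a list of (azimuth_rad, amplitude) pairs."""
    shape = (mics.shape[0], n_frames)
    frames = 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    for azimuth, amplitude in sources:
        signal = amplitude * (rng.standard_normal(n_frames)
                              + 1j * rng.standard_normal(n_frames))
        frames = frames + np.outer(steering_vector(mics, azimuth, freq), signal)
    return frames


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics = circular_array()
    grid = np.deg2rad(np.arange(360))
    interferer = (np.deg2rad(200), 2.0)  # hypothetical loud noise source
    user = (np.deg2rad(60), 1.0)         # hypothetical quieter calling user
    noise_frames = simulate(mics, 1500.0, [interferer], 4000, rng)
    call_frames = simulate(mics, 1500.0, [interferer, user], 4000, rng)
    estimate = estimate_call_direction(call_frames, noise_frames, mics, 1500.0, grid)
    print("estimated azimuth (deg):", np.rad2deg(estimate))
```

In this synthetic setup the interferer is louder than the user, so the raw spatial spectrum of the call frames is dominated by the interferer's direction; subtracting the noise-only spectrum is what lets the weaker call direction stand out. A real system would work across many frequency bins and track noise that changes over time, which this sketch does not attempt.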
Pages: 497-502
Page count: 6