Audio-Visual Fusion for Sound Source Localization and Improved Attention

被引:0
|
作者
Lee, Byoung-gi [1 ]
Choi, JongSuk [1 ]
Yoon, SangSuk [2 ]
Choi, Mun-Taek [2 ]
Kim, Munsang [2 ]
Kim, Daijin [3 ]
机构
[1] Korea Inst Sci & Technol, Ctr Cognit Robot Res, Seoul, South Korea
[2] Korea Inst Sci & Technol, Ctr Intelligent Robot, Seoul, South Korea
[3] Postech, Dept Comp Sci & Engn, Pohang, South Korea
关键词
Audio-Vision Fusion; Sound Source Localization; Human Attention; Robot Tracking;
D O I
10.3795/KSME-A.2011.35.7.737
中图分类号
TH [机械、仪表工业];
学科分类号
0802 ;
摘要
Service robots are equipped with various sensors such as vision camera, sonar sensor, laser scanner, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audiovisual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audiovision fusion: one is on enhancing the performance of sound localization, and the other is on improving robot attention through sound localization and face detection.
引用
收藏
页码:737 / 743
页数:7
相关论文
共 50 条
  • [1] Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization
    Um, Sung Jin
    Kim, Dongjin
    Kim, Jung Uk
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3507 - 3516
  • [2] AUDIO-VISUAL DISCREPANCY AND THE INFLUENCE ON VERTICAL SOUND SOURCE LOCALIZATION
    Werner, Stephan
    Liebetrau, Judith
    Sporer, Thomas
    2012 Fourth International Workshop on Quality of Multimedia Experience (QoMEX), 2012, : 133 - 139
  • [3] Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention
    Duan, Bin
    Tang, Hao
    Wang, Wei
    Zong, Ziliang
    Yang, Guowei
    Yan, Yan
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 4012 - 4021
  • [4] Dual Attention Matching for Audio-Visual Event Localization
    Wu, Yu
    Zhu, Linchao
    Yan, Yan
    Yang, Yi
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6301 - 6309
  • [5] Audio-visual sensing from a quadcopter: dataset and baselines for source localization and sound enhancement
    Wang, Lin
    Sanchez-Matilla, Ricardo
    Cavallaro, Andrea
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 5320 - 5325
  • [6] Multi-Attention Audio-Visual Fusion Network for Audio Spatialization
    Zhang, Wen
    Shao, Jie
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 394 - 401
  • [7] Tracking atoms with particles for audio-visual source localization
    Monaci, Gianluca
    Vandergheynst, Pierre
    Maggio, Emilio
    Cavallaro, Andrea
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 753 - +
  • [8] DMMAN: A two-stage audio-visual fusion framework for sound separation and event localization
    Hu, Ruihan
    Zhou, Songbing
    Tang, Zhi Ri
    Chang, Sheng
    Huang, Qijun
    Liu, Yisen
    Han, Wei
    Wu, Edmond Q.
    NEURAL NETWORKS, 2021, 133 : 229 - 239
  • [9] Audio-Visual Grouping Network for Sound Localization from Mixtures
    Mo, Shentong
    Tian, Yapeng
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10565 - 10574
  • [10] Audio-visual based non-line-of-sight sound source localization: A feasibility study
    King, E. A.
    Tatoglu, A.
    Iglesias, D.
    Matriss, A.
    APPLIED ACOUSTICS, 2021, 171