Attention to clapping - A direct method for detecting sound source from video and audio

Cited by: 3
Authors
Ikeda, T [1 ]
Ishiguro, IE [1 ]
Asada, M [1 ]
Affiliation
[1] Osaka Univ, Grad Sch Engn, Dept Adapt Machine Syst, Suita, Osaka 5650871, Japan
DOI
10.1109/MFI-2003.2003.1232668
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Research on using ubiquitous sensors to support human activities has recently attracted major interest. One required feature of a ubiquitous sensor system is the ability to attend to human signals, such as hand clapping and spoken keywords. To detect and localize these signals, it is useful to fuse visual and audio information. In previous work, sensor fusion is performed at the task level, through separate representations for each sensor, and therefore yields no new information from the fusion itself. This paper proposes another method that fuses sensory signals at the signal level, based on mutual information maximization. The fused signal provides new information that cannot be obtained from the individual sensors. As an example, the paper presents two experimental results on sound source localization by audio-visual fusion.
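The abstract does not state the formulation, so the following is only a minimal sketch of what signal-level audio-visual fusion by mutual information can look like, not the authors' method. It assumes that a scalar audio feature (for example, short-time energy) and each pixel's intensity are jointly Gaussian over a time window, in which case the mutual information reduces to -0.5 * log(1 - rho^2), where rho is their correlation coefficient. The function names `mutual_information_map` and `localize` are hypothetical.

```python
import numpy as np


def mutual_information_map(audio, video, eps=1e-12):
    """Per-pixel mutual information (in nats) between audio and video.

    audio: array of shape (T,), one audio feature per video frame.
    video: array of shape (T, H, W), grayscale frames.
    Assumption (not from the paper): audio and each pixel are jointly
    Gaussian, so MI = -0.5 * log(1 - rho^2).
    """
    a = audio - audio.mean()
    v = video - video.mean(axis=0)
    # Covariance between the audio feature and every pixel, shape (H, W).
    cov = np.tensordot(a, v, axes=(0, 0)) / len(a)
    rho2 = cov ** 2 / (a.var() * v.var(axis=0) + eps)
    # Keep rho^2 strictly below 1 so the logarithm stays finite.
    rho2 = np.clip(rho2, 0.0, 1.0 - 1e-9)
    return -0.5 * np.log1p(-rho2)


def localize(audio, video):
    """Return (row, col) of the pixel sharing the most information with the audio."""
    mi = mutual_information_map(audio, video)
    row, col = np.unravel_index(np.argmax(mi), mi.shape)
    return int(row), int(col)
```

In this sketch the pixel whose intensity varies most consistently with the sound, such as the hands during a clap, receives the highest score. No per-sensor detector is run first, which is the contrast with task-level fusion drawn in the abstract.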
Pages: 264-268
Number of pages: 5