Attention to clapping - A direct method for detecting sound source from video and audio

Cited by: 3

Authors:
Ikeda, T. [1]
Ishiguro, H. [1]
Asada, M. [1]
Affiliations:
[1] Osaka University, Graduate School of Engineering, Department of Adaptive Machine Systems, Suita, Osaka 565-0871, Japan
DOI: 10.1109/MFI-2003.2003.1232668
CLC Classification: TP39 [Computer applications]
Discipline Codes: 081203; 0835
Abstract:
Research approaches that utilize ubiquitous sensors to support human activities have attracted major interest lately. One of the required features of a ubiquitous sensor system is that it pays attention to our signals, such as clapping hands and uttering keywords. To detect and localize these signs, it is useful to fuse visual and audio information. In previous works, sensor fusion is performed in the task-level layer through individual representations of the sensors, and therefore does not yield new information from fusing the sensors. This paper proposes another method that fuses sensory signals based on mutual information maximization in the signal-level layer. The fused signal provides new information that cannot be obtained from the individual sensors. As an example, this paper shows two experimental results of sound source localization by audio-visual fusion.
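The signal-level fusion described in the abstract can be illustrated with a minimal sketch. The Python code below is not the authors' implementation: it assumes synchronized grayscale video frames and an audio waveform, approximates mutual information under a joint-Gaussian assumption, and scores every pixel by how strongly its temporal change shares information with the audio energy envelope. The function names (gaussian_mutual_information, localize_sound_source) and all parameters are illustrative.

```python
import numpy as np


def gaussian_mutual_information(x, y):
    """Mutual information between two 1-D signals under a joint-Gaussian
    assumption: I(X; Y) = -0.5 * log(1 - rho**2), where rho is the
    Pearson correlation coefficient."""
    if x.std() == 0 or y.std() == 0:
        return 0.0  # a constant signal carries no information
    rho = np.corrcoef(x, y)[0, 1]
    rho = np.clip(rho, -0.999999, 0.999999)  # keep log() finite
    return -0.5 * np.log(1.0 - rho ** 2)


def localize_sound_source(frames, audio, frame_rate, sample_rate):
    """Build a per-pixel map of mutual information between visual change
    and the audio energy envelope.

    frames : ndarray, shape (T, H, W), grayscale video frames
    audio  : 1-D waveform synchronized with the video
    Returns an (H, W) map; its argmax is the estimated source location.
    """
    T, H, W = frames.shape
    samples_per_frame = int(sample_rate / frame_rate)

    # Audio energy per video frame (one scalar per frame interval).
    energy = np.array([
        np.sum(audio[t * samples_per_frame:(t + 1) * samples_per_frame] ** 2)
        for t in range(T)
    ])

    # Visual change: absolute temporal difference of each pixel.
    motion = np.abs(np.diff(frames, axis=0))   # shape (T-1, H, W)
    energy = energy[1:]                        # align with the differences

    mi_map = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            mi_map[i, j] = gaussian_mutual_information(motion[:, i, j], energy)
    return mi_map


# Illustrative usage: the pixel with the highest audio-visual mutual
# information is taken as the sound source (e.g. the clapping hands).
# mi_map = localize_sound_source(frames, audio, frame_rate=30, sample_rate=16000)
# source_row, source_col = np.unravel_index(np.argmax(mi_map), mi_map.shape)
```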
Pages: 264-268
Page count: 5
Related Papers (50 records in total):
  • [21] BRIDGING HIGH-QUALITY AUDIO AND VIDEO VIA LANGUAGE FOR SOUND EFFECTS RETRIEVAL FROM VISUAL QUERIES
    Wilkins, Julia
    Salamon, Justin
    Fuentes, Magdalena
    Bello, Juan Pablo
    Nieto, Oriol
    2023 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, WASPAA, 2023,
  • [22] An equivalent source method for calculation of the sound radiated from aircraft engines
    Holste, F
    JOURNAL OF SOUND AND VIBRATION, 1997, 203 (04) : 667 - 695
  • [23] MANet: a Motion-Driven Attention Network for Detecting the Pulse from a Facial Video with Drastic Motions
    Liu, Xuenan
    Yang, Xuezhi
    Meng, Ziyan
    Wang, Ye
    Zhang, Jie
    Wong, Alexander
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 2385 - 2390
  • [24] From Blind to Guided Audio Source Separation [How models and side information can improve the separation of sound]
    Vincent, Emmanuel
    Bertin, Nancy
    Gribonval, Remi
    Bimbot, Frederic
    IEEE SIGNAL PROCESSING MAGAZINE, 2014, 31 (03) : 107 - 115
  • [25] Frequency Component Grouping Based Sound Source Extraction from Mixed Audio Signals Using Spectral Analysis
    Hossain, Shifat
    Khan, Shadman Sakib
    Sunny, Md Samiul Haque
    Ahmadi, Mohiuddin
    2017 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL INFORMATION AND COMMUNICATION TECHNOLOGY (EICT 2017), 2017,
  • [26] The method of video panorama construction from low detail source videos
    Timofeev, Boris S.
    Obukhova, Natalia A.
    Motyko, Alexandr A.
    2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS BERLIN (ICCE-BERLIN), 2014, : 360 - 363
  • [27] Efficient method for calculating sound radiation from a circular source in an infinite baffle
    Szemela, Krzysztof
    Rdzanek, Wojciech P.
    Pawelczyk, Marek
    Cheng, Li
    JOURNAL OF SOUND AND VIBRATION, 2024, 588
  • [28] Involuntary attention in children as a function of sound source location: evidence from event-related potentials
    Shestakova, A
    Ceponiene, R
    Huotilainen, M
    Yaguchi, K
    CLINICAL NEUROPHYSIOLOGY, 2002, 113 (01) : 162 - 168
  • [29] An Efficient Method for Detecting Electrical Spark and Fire Flame from Real Time Video
    Corraya, Sonia
    Uddin, Jia
    ADVANCES IN SIGNAL PROCESSING AND INTELLIGENT RECOGNITION SYSTEMS, 2018, 678 : 359 - 368
  • [30] Source Acquisition Device Identification from Recorded Audio Based on Spatiotemporal Representation Learning with Multi-Attention Mechanisms
    Zeng, Chunyan
    Feng, Shixiong
    Zhu, Dongliang
    Wang, Zhifeng
    ENTROPY, 2023, 25 (04)