Binaural Sound Source Distance Estimation and Localization for a Moving Listener

Cited by: 4
Authors
Krause, Daniel Aleksander [1 ]
Garcia-Barrios, Guillermo [2 ]
Politis, Archontis [1 ]
Mesaros, Annamaria [1 ]
Affiliations
[1] Tampere Univ, Comp Sci, Tampere 33720, Finland
[2] Univ Politecn Madrid, Grp Acoust & MultiMedia Applicat, Madrid 33014, Spain
Funding
Academy of Finland;
Keywords
Sound source localization; sound distance estimation; binaural audio; DEEP NEURAL-NETWORKS; POSITION ESTIMATION; HEAD MOVEMENTS; ROBUST; MODEL;
DOI
10.1109/TASLP.2023.3346297
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we investigate the tasks of binaural source distance estimation (SDE) and direction-of-arrival estimation (DOAE) using motion-based cues in a scenario with a walking listener. On top of performing both tasks as separate problems, we study two methods of solving the joint task of simultaneous source distance estimation and localization (SDEL), with a single model. Experiments are conducted for three different scenarios: a static receiver; a static receiver with a rotating head; and a freely moving listener inside a room. The study proposes rotation and translation features to include information about the receiver's motion during model training and studies the effects of these on the final performance. The work includes extended simulation of three datasets containing numerous testing scenarios for sound sources, covering a wide range of DOAs and a source-to-receiver distance up to 15 m. Results are further analyzed with respect to room reverberation, walking speed, as well as source-to-receiver distance. The presented outcomes show large improvements in both DOA and distance estimation for a model that uses motion-based cues as compared with a static scenario. These include a decrease of 9.50° in DOA and 1.56 m in distance errors for a joint model, followed by 16.17° and 0.17 m for separate models.
Pages: 996 - 1011
Number of pages: 16
Related papers
50 in total
  • [21] Binaural sound source localization based on weighted template matching
    Liu, Hong
    Sun, Yongheng
    Yang, Ge
    Chen, Yang
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2021, 6 (02) : 214 - 223
  • [22] Real-time binaural azimuthal sound source localization
    Ponca, M
    Scarbata, G
    6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL III, PROCEEDINGS: IMAGE, ACOUSTIC, SPEECH AND SIGNAL PROCESSING I, 2002, : 350 - 355
  • [23] Distance perception of a virtual sound source synthesized near the listener position
    Dong-Soo Kang
    Jung-Woo Choi
    William L. Martens
    Multimedia Tools and Applications, 2016, 75 : 5161 - 5182
  • [24] Distance perception of a virtual sound source synthesized near the listener position
    Kang, Dong-Soo
    Choi, Jung-Woo
    Martens, William L.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (09) : 5161 - 5182
  • [25] MINIMUM AUDITORY MOVEMENT ANGLE - BINAURAL LOCALIZATION OF MOVING SOUND SOURCES
    PERROTT, DR
    MUSICANT, AD
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1977, 62 (06): : 1463 - 1466
  • [26] Binaural Source Localization by Joint Estimation of ILD and ITD
    Raspaud, Martin
    Viste, Harald
    Evangelista, Gianpaolo
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (01): : 68 - 77
  • [27] Robust binaural sound source localization based on sub-band SNR estimation and soft decision
    Zhou, Lin
    Zhao, Xiaoyan
    Cheng, Xu
    Li, Nijun
    Wu, Zhenyang
    Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2015, 45 (04): : 619 - 624
  • [28] MONAURAL-BINAURAL MINIMUM AUDIBLE ANGLES FOR A MOVING SOUND SOURCE
    HARRIS, JD
    SERGEANT, RL
    JOURNAL OF SPEECH AND HEARING RESEARCH, 1971, 14 (03): : 618 - &
  • [29] Binaural Sound Source Distance Reproduction Based on Distance Variation Function and Artificial Reverberation
    Xu, Jiawang
    Wang, Xiaochen
    Zhang, Maosheng
    Yang, Cheng
    Gao, Ge
    MULTIMEDIA MODELING, MMM 2017, PT II, 2017, 10133 : 101 - 111
  • [30] Moving sound source localization in large areas
    Pertilä, P
    Parviainen, M
    Korhonen, T
    Visa, A
    ISPACS 2005: PROCEEDINGS OF THE 2005 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, 2005, : 745 - 748