3D Localization of Multiple Simultaneous Speakers with Discrete Wavelet Transform and Proposed 3D Nested Microphone Array

Cited by: 0
Authors
Firoozabadi, Ali Dehghan [1]
Durney, Hugo [1]
Soto, Ismael [2]
Olave, Miguel Sanhueza [1]
Affiliations
[1] Univ Tecnol Metropolitana, Dept Elect, Av Jose Pedro Alessandri 1242, Santiago 7800002, Chile
[2] Univ Santiago Chile, Elect Engn Dept, Santiago, Chile
Keywords
Simultaneous sound source localization; Wavelet Transform; Generalized Cross-Correlation; Nested microphone array; Subband processing;
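DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract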
Multiple sound source localization is an important topic in speech processing. The Generalized Cross-Correlation (GCC) function is a traditional algorithm for sound source localization. It estimates the direction of arrival (DOA) of multiple speakers by calculating the cross-correlation between microphone signals, but its accuracy decreases in adverse conditions. The aim of the method proposed in this paper is to localize multiple simultaneous speakers in adverse conditions. The method combines a novel 3D nested microphone array with information obtained from the Discrete Wavelet Transform (DWT) and subband processing. The proposed 3D nested microphone array makes 3D localization possible and eliminates spatial aliasing between microphone signals. We also propose using the DWT to extract information from the speech signal. Since the spectral information of speech is concentrated at low frequencies, we propose a DWT-based filter bank structure that increases the frequency resolution at low frequencies. Evaluation on real and simulated data shows that the proposed method outperforms fullband processing and subband processing with uniform filters and a uniform microphone array.
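As a rough illustration of the cross-correlation step the abstract refers to, the following is a minimal sketch of a GCC-based delay estimate between two microphone signals. It is not the authors' implementation: the PHAT weighting, the function name `gcc_phat_delay`, and its parameters are assumptions chosen for illustration, and the nested array, DWT filter bank, and multi-speaker handling described in the abstract are not reproduced here.

```python
import numpy as np


def gcc_phat_delay(x, y, fs, max_delay=None):
    """Estimate the delay (in seconds) of y relative to x.

    Hypothetical helper: GCC with PHAT weighting (an assumed choice;
    the abstract only names the GCC function in general).
    A positive result means y lags behind x.
    """
    n = len(x) + len(y)  # zero-pad so the circular correlation does not wrap
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum, normalized to unit magnitude (PHAT weighting).
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_delay is not None:
        max_shift = max(1, min(int(fs * max_delay), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    x = rng.standard_normal(1024)
    y = np.concatenate((np.zeros(5), x[:-5]))  # y is x delayed by 5 samples
    print(gcc_phat_delay(x, y, fs) * fs)       # estimated delay in samples
```

For a single microphone pair, an estimated delay of this kind is commonly converted to a direction of arrival using the microphone spacing and the speed of sound; the abstract does not state how the proposed method performs that step for the 3D array.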
Pages: 356-360
Number of pages: 5