3D Localization of Multiple Simultaneous Speakers with Discrete Wavelet Transform and Proposed 3D Nested Microphone Array

Cited by: 0

Authors:

Firoozabadi, Ali Dehghan [1]

Durney, Hugo [1]

Soto, Ismael [2]

Olave, Miguel Sanhueza [1]

Affiliations:

[1] Univ Tecnol Metropolitana, Dept Elect, Av Jose Pedro Alessandri 1242, Santiago 7800002, Chile

[2] Univ Santiago Chile, Elect Engn Dept, Santiago, Chile

Source:

2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2018

Keywords:

Simultaneous sound source localization; Wavelet Transform; Generalized Cross-Correlation; Nested microphone array; Subband processing;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Multiple sound source localization is an important topic in speech processing. The generalized cross-correlation (GCC) function is a traditional algorithm for sound source localization. It estimates the direction of arrival (DOA) of multiple speakers by calculating the cross-correlation between microphone signals, but its accuracy decreases in adverse conditions. The aim of the method proposed in this paper is to localize multiple simultaneous speakers in adverse conditions. The proposed method is based on a novel 3D nested microphone array combined with information obtained from the Discrete Wavelet Transform (DWT) and subband processing. The proposed 3D nested microphone array enables 3D localization and eliminates spatial aliasing between the microphone signals. We also propose using the DWT to extract information from the speech signal. Since the spectral information of a speech signal is concentrated at low frequencies, we propose a DWT-based filter bank structure that increases the frequency resolution at low frequencies. Evaluation on real and simulated data shows the superiority of the proposed method in comparison with fullband processing and subband processing with uniform filters and a uniform microphone array.
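The GCC step mentioned in the abstract can be illustrated with a minimal sketch for a single microphone pair. This is not the authors' implementation: the abstract does not say which GCC weighting is used, so the PHAT weighting here is an assumption, and the function name `gcc_phat`, its parameters, and the synthetic test signal are hypothetical. It only shows how a time delay, from which a DOA could later be derived, might be estimated from the cross-correlation of two signals.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper: GCC with PHAT weighting (an assumption, the
    abstract does not specify the weighting function).
    """
    n = len(sig) + len(ref)            # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)      # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Synthetic example: the second signal is the first delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(5), x[:-5]))
delay = gcc_phat(y, x, fs)
print(f"estimated delay: {delay * fs:.0f} samples")
```

In a subband scheme such as the one the abstract describes, a function like this would presumably be applied per subband and per microphone pair; how the authors combine those estimates is not stated in this record.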


Pages: 356-360

Page count: 5

50 items in total

[31] 3D Color Point Cloud Compression with Plane fitting and Discrete Wavelet Transform
Chithra, P. L.
Tamilmathi, Christoper A.
2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2018, : 20 - 26
[32] HEVC and 3D dual-tree discrete wavelet transform based multiple description video coding
Chen, Jing
Liao, Jie
Yang, Yuhang
Cai, Canhui
JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2016, 16 (04) : 955 - 965
[33] Three ring microphone array for 3D sound localization and separation for mobile robot audition
Tamai, Y
Sasaki, Y
Kagami, S
Mizoguchi, H
2005 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2005, : 903 - 908
[34] 3D Fourier based discrete Radon transform
Averbuch, A
Shkolnisky, Y
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2003, 15 (01) : 33 - 69
[35] 3D discrete X-ray transform
Averbuch, A
Shkolnisky, Y
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2004, 17 (03) : 259 - 276
[36] 3D Discrete Shearlet Transform and Video Denoising
Labate, Demetrio
Negi, Pooran Singh
WAVELETS AND SPARSITY XIV, 2011, 8138
[37] Sparse 3D Array made from Nested Linear Array Branches for Underdetermined Source Localization
Yadav, Shekhar Kumar
George, Nithin V.
2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 1746 - 1750
[38] Efficient Multiscale and Multidirectional Representation of 3D Data using the 3D Discrete Shearlet Transform
Goossens, Bart
Luong, Hiep
Aelterman, Jan
Pizurica, Aleksandra
Philips, Wilfried
WAVELETS AND SPARSITY XIV, 2011, 8138
[39] Directional 3D Wavelet Transform Based on Gaussian Mixtures for the Analysis of 3D Ultrasound Ovarian Volumes
Cigale, Boris
Zazula, Damjan
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41 (01) : 64 - 77
[40] Camera and microphone array for 3D audiovisual face data collection
Hu, Yuxiao
Tang, Hao
Huang, Thomas S.
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 2161 - 2164
