Speaker Orientation Estimation based on Hybridation of GCC-PHAT and HLBR

Cited by: 0
Authors
Segura, Carlos [1]
Abad, Alberto [1]
Hernando, Javier [1]
Nadeu, Climent [1]
Affiliations
[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain
Keywords
Head orientation; Speaker orientation; Speaker localization;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel approach to speaker orientation estimation in a Smart Room environment equipped with multiple microphones. Our previous work showed that the ratio between the high-band and low-band energies (HLBR) received at each microphone is a promising cue for estimating the direction in which a speaker's voice is projected. In this work, for each microphone pair, a smoothed cross-power spectrum (CPS) phase is obtained by suitably windowing the main peak of the cross-correlation sequence estimated with the GCC-PHAT method, and an HLBR is then computed from the processed CPS. The proposed method keeps the computational simplicity of the HLBR algorithm while adding the robustness offered by the GCC-PHAT technique. Preliminary experiments were conducted on a database recorded specifically for this purpose in the UPC Smart Room and on the CLEAR head pose database. The proposed method performs consistently better than other state-of-the-art techniques on both databases.
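The per-pair processing chain described in the abstract (GCC-PHAT cross-correlation, windowing of the main peak, transformation back to a smoothed CPS, high/low band ratio) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the Hann-shaped taper, the window half-width of 8 samples, the 3500 Hz band split, and the use of mean band energies are all assumptions that the abstract does not specify.

```python
import numpy as np

def gcc_phat(x, y, n_fft):
    """Cross-correlation of x and y with PHAT weighting (unit-magnitude CPS)."""
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(y, n_fft)
    cps = X * np.conj(Y)
    cps = cps / np.maximum(np.abs(cps), 1e-12)  # keep phase only
    return np.fft.irfft(cps, n_fft)

def windowed_hlbr(x, y, fs, half_width=8, split_hz=3500.0):
    """Return (high/low band ratio, signed peak lag) for one microphone pair.

    half_width and split_hz are assumed values chosen for illustration.
    """
    n_fft = 2 * len(x)  # zero-pad to limit circular wrap-around
    cc = gcc_phat(x, y, n_fft)
    peak = int(np.argmax(cc))

    # Signed circular distance of every lag index from the main peak.
    dist = (np.arange(n_fft) - peak + n_fft // 2) % n_fft - n_fft // 2
    # Hann-shaped taper that keeps only the neighbourhood of the main peak.
    taper = 0.5 * (1.0 + np.cos(np.pi * dist / (half_width + 1)))
    window = np.where(np.abs(dist) <= half_width, taper, 0.0)

    # Back to the frequency domain: a smoothed cross-power spectrum.
    smoothed_cps = np.fft.rfft(cc * window)
    energy = np.abs(smoothed_cps) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    high = energy[freqs >= split_hz].mean()
    low = energy[(freqs > 0) & (freqs < split_hz)].mean()
    lag = peak if peak < n_fft // 2 else peak - n_fft
    return high / max(low, 1e-12), lag
```

In a multi-microphone setup, one such ratio would be computed per microphone pair, and the pattern of ratios across pairs would then be combined into an orientation estimate; how that combination is done is not described in the abstract and is left out here.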
Pages: 1325-1328
Number of pages: 4
Related papers
50 in total
  • [31] Centroid Estimation with Transformer-Based Speaker Embedder for Robust Target Speaker Extraction
    Heo, Woon-Haeng
    Maeng, Joongyu
    Kang, Yoseb
    Cho, Namhyun
    INTERSPEECH 2024, 2024, : 4333 - 4337
  • [32] HISTOGRAM BASED DOA ESTIMATION FOR SPEAKER LOCALISATION IN REVERBERANT ENVIRONMENTS
    Trinkle, Matthew
    Hashemi-Sakhtsari, Ahmad
    PROCEEDINGS OF THE 2015 10TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, 2015, : 166 - 170
  • [33] Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles
    Feng Guo
    Yuhang Cao
    Zhaoqiong Huang
    Xing You
    Haixing Guan
    Jiaen Liang
    Baoqing Li
    Circuits, Systems, and Signal Processing, 2019, 38 : 2320 - 2334
  • [34] Speaker Based Vocal Tract Shape Estimation for Kannada Vowels
    Prasad, Shiva K. M.
    Kumar, Anil C.
    Ramaiah, G. N. Kodanda
    Manjunatha, M. B.
    2015 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, SIGNALS, COMMUNICATION AND OPTIMIZATION (EESCO), 2015,
  • [35] EMAP-based speaker adaptation with robust correlation estimation
    Jon, E
    Kim, DK
    Kim, NS
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 321 - 324
  • [36] Robust correlation estimation for EMAP-Based speaker adaptation
    Jon, E
    Kim, DK
    Kim, NS
    IEEE SIGNAL PROCESSING LETTERS, 2001, 8 (06) : 184 - 186
  • [37] Dimension Reduction Approaches for SVM based Speaker Age Estimation
    Dobry, Gil
    Hecht, Ron M.
    Avigal, Mireille
    Zigel, Yaniv
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1999 - +
  • [38] Speaker adaptation based on MAP estimation using fuzzy controller
    Juang, YT
    Huang, KC
    Ding, IJ
    PATTERN RECOGNITION LETTERS, 2003, 24 (15) : 2807 - 2813
  • [39] Tree-Based Estimation of Speaker Characteristics for Speech Recognition
    Blomberg, Mats
    Elenius, Daniel
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 584 - 587
  • [40] Speaker Direction-of-Arrival Estimation Based on Orthogonal Dipoles
    Guo, Feng
    Cao, Yuhang
    Huang, Zhaoqiong
    You, Xing
    Guan, Haixing
    Liang, Jiaen
    Li, Baoqing
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (05) : 2320 - 2334