Auditory-based Formant Estimation in Noise using a Probabilistic Framework

被引:0
|
作者
Glaeser, Claudius [1 ]
Heckmann, Martin [1 ]
Joublin, Frank [1 ]
Goerick, Christian [1 ]
机构
[1] Honda Res Inst Europe, D-63073 Offenbach, Germany
关键词
speech processing; formant extraction; tracking; robustness; Bayes procedures;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We recently introduced a computationally efficient framework for tracking formants which combines a biologically inspired preprocessing for enhancing formants in spectrograms with a probabilistic framework for estimating formant trajectories. In contrast to previously published approaches our tracking scheme relies on the joint distribution of formants rather than using independent tracking instances for each formant separately. Therewith more precise formant estimates could be obtained. In this paper we will briefly review our algorithm and extend it by using more sophisticated models of the formants underlying dynamics. Furthermore, we will dwell on the robustness of our method for speech degraded by various types of noise. A comprehensive evaluation on a large publicly available database containing hand-labeled formant trajectories shows significant performance improvements in both clean and noisy speech compared to state of the art approaches.
引用
收藏
页码:2606 / 2609
页数:4
相关论文
共 50 条
  • [1] Speech Enhancement Using Auditory-Based Transform
    Tank, Vanita Raj
    Mahajan, S. P.
    Khaparde, Arti
    Deshpande, Rahul
    2015 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING (ICICS), 2015,
  • [2] ROBUST SPEAKER IDENTIFICATION USING AN AUDITORY-BASED FEATURE
    Li, Qi
    Huang, Yan
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4514 - 4517
  • [3] Formant Estimation and Tracking using Probabilistic Heat-Maps
    Shrem, Yosi t
    Kreuk, Felix
    Keshet, Joseph
    INTERSPEECH 2022, 2022, : 3563 - 3567
  • [4] Formant frequency estimation in noise
    Chen, B
    Loizou, PC
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 581 - 584
  • [5] An auditory-based adaptive speech enhancement system by neural network according to noise intensity
    Choi, J
    Okamoto, J
    Nakajima, S
    Suzuki, Y
    Hosokawa, S
    42ND MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, PROCEEDINGS, VOLS 1 AND 2, 1999, : 993 - 996
  • [6] AN AUDITORY-BASED FEATURE FOR ROBUST SPEECH RECOGNITION
    Shao, Yang
    Jin, Zhaozhang
    Wang, DeLiang
    Srinivasan, Soundararajan
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4625 - +
  • [7] Robust classification of stop consonants using auditory-based speech processing
    Ali, AMA
    Van der Spiegel, J
    Mueller, P
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 81 - 84
  • [8] Robust auditory-based speech processing using the average localized synchrony detection
    Ali, AMA
    Van der Spiegel, J
    Mueller, P
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (05): : 279 - 292
  • [9] Robust Auditory-Based Speech Feature Extraction Using Independent Subspace Method
    Wu, Qiang
    Zhang, Liqing
    Xia, Bin
    ADVANCES IN COGNITIVE NEURODYNAMICS, PROCEEDINGS, 2008, : 405 - +
  • [10] An auditory-based measure for improved phone segment concatenation
    Chappell, DT
    Hansen, JHL
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1639 - 1642