A Robust Pitch Extractor Based on DTW Lines and CASA with Application in Noisy Speech Recognition

Cited by: 0
Authors
Morales-Cordovilla, Juan A. [1]
Cabanas-Molero, Pablo
Peinado, Antonio M. [1]
Sanchez, Victoria [1]
Affiliations
[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain
Keywords
pitch extractor; pitch line; CASA; DTW; noise; robust speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a robust pitch extractor for Automatic Speech Recognition based on selecting pitch lines from a tonegram (a representation of the energy of each candidate pitch at each time frame). First, the tonegram and its maximum-energy regions are extracted, and a Dynamic Time Warping (DTW) algorithm finds the most energetic trajectories, or pitch lines, within these regions. A second stage estimates the tonegram of the most energetic lines by applying Computational Auditory Scene Analysis (CASA) rules, which reject and group octave-related lines. The mean pitch of the speaker is then estimated, and the final pitch is obtained by rejecting lines that lie far from this mean. The proposed pitch extractor is evaluated in a novel way: by the word accuracy of a Missing Data recognizer on the Aurora-2 database.
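The line-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: a plain dynamic-programming search stands in for the DTW stage, and a simple ratio test stands in for the mean-pitch rejection. The function names, the `max_jump` and `max_ratio` parameters, and the list-of-lists tonegram layout are all assumptions made for illustration.

```python
from typing import List, Sequence


def best_pitch_line(tonegram: Sequence[Sequence[float]], max_jump: int = 1) -> List[int]:
    """Return one pitch-bin index per frame that maximises the summed energy.

    tonegram[t][p] is the energy of pitch candidate p at frame t.
    max_jump (hypothetical parameter) limits how many bins the line may
    move between consecutive frames, which keeps the trajectory continuous.
    """
    if not tonegram:
        return []
    n_bins = len(tonegram[0])
    score = list(tonegram[0])      # best accumulated energy ending in each bin
    back: List[List[int]] = []     # back-pointers for every frame after the first

    for frame in tonegram[1:]:
        new_score = []
        pointers = []
        for p in range(n_bins):
            lo = max(0, p - max_jump)
            hi = min(n_bins - 1, p + max_jump)
            # Best reachable predecessor bin for bin p.
            prev = max(range(lo, hi + 1), key=lambda q: score[q])
            new_score.append(score[prev] + frame[p])
            pointers.append(prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final bin to recover the whole line.
    p = max(range(n_bins), key=lambda q: score[q])
    path = [p]
    for pointers in reversed(back):
        p = pointers[p]
        path.append(p)
    path.reverse()
    return path


def keep_lines_near_mean(lines_hz: Sequence[Sequence[float]],
                         mean_pitch_hz: float,
                         max_ratio: float = 1.5) -> List[List[float]]:
    """Keep only lines whose average pitch is within max_ratio of the mean.

    max_ratio is an assumed tolerance; the abstract does not state one.
    """
    kept = []
    for line in lines_hz:
        avg = sum(line) / len(line)
        if mean_pitch_hz / max_ratio <= avg <= mean_pitch_hz * max_ratio:
            kept.append(list(line))
    return kept
```

With a small 4-frame, 4-bin tonegram, `best_pitch_line` follows the continuous high-energy ridge rather than jumping to an isolated peak. With a mean pitch of 100 Hz, `keep_lines_near_mean` discards a line sitting near 200 Hz, the kind of octave-related candidate the abstract says is rejected.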
Pages: 197-206
Number of pages: 10
Related papers
50 in total
  • [41] Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments
    Bashirpour, Meysam
    Geravanchizadeh, Masoud
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2018,
  • [42] Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments
    Meysam Bashirpour
    Masoud Geravanchizadeh
    EURASIP Journal on Audio, Speech, and Music Processing, 2018
  • [43] Combined Multi-channel NMF-based Robust Beamforming for Noisy Speech Recognition
    Mimura, Masato
    Bando, Yoshiaki
    Shimada, Kazuki
    Sakai, Shinsuke
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2451 - 2455
  • [44] MEAN NORMALIZATION OF POWER FUNCTION BASED CEPSTRAL COEFFICIENTS FOR ROBUST SPEECH RECOGNITION IN NOISY ENVIRONMENT
    Baek, Soonho
    Kang, Hong-Goo
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [45] ROBUST FRONT-END PROCESSING FOR SPEECH RECOGNITION IN NOISY CONDITIONS
    Das, Biswajit
    Panda, Ashish
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5235 - 5239
  • [46] Speech Recognition Based on Efficient DTW Algorithm and Its DSP Implementation
    Jing XinXing
    Shi Xu
    2012 INTERNATIONAL WORKSHOP ON INFORMATION AND ELECTRONICS ENGINEERING, 2012, 29 : 832 - 836
  • [47] Auditory model for robust speech recognition in real world noisy environments
    Kim, DS
    Lee, SY
    Kil, RM
    Zhu, XL
    ELECTRONICS LETTERS, 1997, 33 (01) : 12 - 13
  • [48] Blind source extraction for robust speech recognition in multisource noisy environments
    Nesta, Francesco
    Matassoni, Marco
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03) : 703 - 725
  • [49] Robust Front-End Processing For Emotion Recognition In Noisy Speech
    Pandharipande, Meghna
    Chakraborty, Rupayan
    Panda, Ashish
    Kopparapu, Sunil Kumar
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 324 - 328
  • [50] ROBUST SPEECH RECOGNITION UNDER NOISY ENVIRONMENTS USING ASYMMETRIC TAPERS
    Alam, Md Jahangir
    Kenny, Patrick
    O'Shaughnessy, Douglas
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 1638 - 1642