A New Subband-Weighted MVDR-Based Front-End for Robust Speech Recognition

Cited by: 4
Authors
Seyedin, Sanaz [1]
Ahadi, Seyed Mohammad [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran 15914, Iran
Keywords
feature extraction; robust MVDR power spectral estimation; speech recognition
DOI
10.1587/transinf.E93.D.2252
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents a novel noise-robust feature extraction method for speech recognition. It is based on making the Minimum Variance Distortionless Response (MVDR) power spectrum estimation method robust against noise. This robustness is obtained by modifying the distortionless constraint of the MVDR spectral estimation method: the sub-band power spectrum values are weighted according to the sub-band signal-to-noise ratios. The optimum weighting is obtained by employing the experimental findings of psychoacoustics. According to our experiments, this technique is successful in modifying the power spectrum of speech signals and making it robust against noise. When evaluated on the Aurora 2 recognition task, the method outperformed both the baseline MFCC features and the MVDR-based features in different noisy conditions.
Pages: 2252-2261
Page count: 10
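
The abstract builds on MVDR power spectrum estimation. As a rough illustration only, the sketch below computes the standard (unweighted) MVDR estimate P(w) = 1 / (e(w)^H R^{-1} e(w)) for one frame. It is not the sub-band-weighted variant proposed in the paper, whose modified constraint and weights are not given in this record. The function name `mvdr_spectrum`, the model order, the frequency grid, and the small diagonal loading term are all assumptions made for this example.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=129, loading=1e-6):
    """Standard MVDR power spectrum of one frame (illustrative sketch).

    order, n_freqs and loading are assumed values, not taken from the paper.
    The frame must not be all zeros, otherwise R is singular.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Biased autocorrelation estimates for lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Symmetric Toeplitz autocorrelation matrix R[i, j] = r[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order + 1), np.arange(order + 1)))
    R = r[lags]
    # Small diagonal loading keeps R well conditioned (an assumption).
    R = R + loading * r[0] * np.eye(order + 1)
    # Steering vectors e(w) = [1, e^{jw}, ..., e^{j*order*w}]^T, one per column.
    omegas = np.linspace(0.0, np.pi, n_freqs)
    E = np.exp(1j * np.outer(np.arange(order + 1), omegas))
    # Denominator e^H R^{-1} e for every frequency at once.
    denom = np.real(np.sum(np.conj(E) * np.linalg.solve(R, E), axis=0))
    return omegas, 1.0 / denom
```

Calling `mvdr_spectrum(frame)` on a windowed speech frame returns the frequency grid in radians per sample and the corresponding power estimates; a cepstral front-end would typically apply a filter bank, logarithm, and DCT to such a spectrum.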