Mel-Wiener filter for Mel-LPC based speech recognition

Cited by: 1
Authors
Islam, Md. Babul [1]
Yamamoto, Kazumasa
Matsumoto, Hiroshi
Affiliations
[1] Shinshu Univ, Grad Sch Sci & Technol, Nagano 3800921, Japan
[2] Shinshu Univ, Fac Engn, Nagano 3800921, Japan
[3] Islam Univ, Dept Comp Sci & Engn, Kushtia, Bangladesh
[4] Shinshu Univ, Dept Elect & Elect Engn, Nagano, Japan
[5] Tohoku Univ, Dept Elect Engn, Sendai, Miyagi 980, Japan
Source
Keywords
noisy speech recognition; Mel-Wiener filter; Mel-LPC analysis; bilinear transformation; Aurora-2 database
DOI
10.1093/ietisy/e90-d.6.935
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a Mel-Wiener filter to enhance Mel-LPC spectra in the presence of additive noise. The transfer function of the proposed filter is defined using a first-order all-pass filter in place of the unit delay. The filter coefficients are estimated by minimizing the sum of squared errors on the linear frequency scale, without applying the bilinear transformation, and the estimation is implemented efficiently in the autocorrelation domain. The proposed filter requires no time-frequency conversion, which saves a large amount of computation. The performance of the proposed system is comparable to that of the ETSI AFE. The optimum filter order is found to be 3, so the filtering itself is computationally inexpensive. Excluding VAD, the computational cost of the proposed system is 53% of that of the ETSI AFE.
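The abstract's central device, replacing the unit delay with a first-order all-pass filter, can be illustrated with a minimal sketch. It shows only the standard frequency warping that such an all-pass section produces, not the paper's filter estimation procedure. The warping factor `alpha = 0.35`, the 8 kHz sampling rate, and the function names are assumptions chosen for illustration; none of them is stated in the abstract.

```python
import cmath
import math


def allpass_response(omega: float, alpha: float) -> complex:
    """Frequency response of the first-order all-pass section
    (z^-1 - alpha) / (1 - alpha * z^-1), evaluated at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    return (z_inv - alpha) / (1.0 - alpha * z_inv)


def warped_frequency(omega: float, alpha: float) -> float:
    """Warped angular frequency: the negated phase of the all-pass section.
    For 0 < alpha < 1, low frequencies are stretched and high ones compressed."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    alpha = 0.35           # assumed warping factor (illustrative only)
    sample_rate = 8000.0   # assumed sampling rate in Hz (illustrative only)
    for freq_hz in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        omega = 2.0 * math.pi * freq_hz / sample_rate
        warped_hz = warped_frequency(omega, alpha) * sample_rate / (2.0 * math.pi)
        print(f"{freq_hz:7.1f} Hz -> {warped_hz:7.1f} Hz on the warped axis")
```

With `alpha = 0` the mapping reduces to the identity, which corresponds to the ordinary unit delay. A positive `alpha` bends the frequency axis toward a Mel-like scale, which is why an all-pass section can stand in for the delay element in a Mel-scale filter.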
Pages: 935-942
Page count: 8
Related papers
50 in total
  • [41] Application of Teager Energy Operator on Linear and Mel Scales for Whispered Speech Recognition
    Markovic, Branko R.
    Galic, Jovan
    Mijic, Miomir
    [J]. ARCHIVES OF ACOUSTICS, 2018, 43 (01) : 3 - 9
  • [42] Leveraged Mel Spectrograms Using Harmonic and Percussive Components in Speech Emotion Recognition
    Rudd, David Hason
    Huo, Huan
    Xu, Guandong
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT II, 2022, 13281 : 392 - 404
  • [43] BANDWIDTH EXTENSION OF TELEPHONE SPEECH USING A FILTER BANK IMPLEMENTATION FOR HIGHBAND MEL SPECTRUM
    Pulakka, Hannu
    Myllyla, Ville
    Laaksonen, Laura
    Alku, Paavo
    [J]. 18TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2010), 2010, : 979 - 983
  • [44] Multilayered network for LPC based speech recognition
    Patil, PB
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 1998, 44 (02) : 435 - 438
  • [45] LPC Based Speech Recognition for Kannada Vowels
    Unnibhavi, Anand H.
    Jangamshetti, D. S.
    [J]. 2017 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT), 2017, : 642 - 645
  • [46] Recognition of Human Speech Emotion Using Variants of Mel-Frequency Cepstral Coefficients
    Palo, Hemanta Kumar
    Chandra, Mahesh
    Mohanty, Mihir Narayan
    [J]. ADVANCES IN SYSTEMS, CONTROL AND AUTOMATION, 2018, 442 : 491 - 498
  • [47] Bi-mel-scale frequency cepstrum and its application in telephone speech recognition
    CHEN Jingdong
    XU Bo
    HUANG Taiyi(National Laboratory of Pattern Recognition
    [J]. Chinese Journal of Acoustics, 1998, (03) : 234 - 243
  • [48] Emotion Recognition from Speech Signal Using Mel-Frequency Cepstral Coefficients
    Korkmaz, Onur Erdem
    Atasoy, Ayten
    [J]. 2015 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ELECO), 2015, : 1254 - 1257
  • [49] CELP Speech Coding Based on Mel-Generalized Cepstral Analyses
    [J]. 2000, John Wiley and Sons Inc. (83):
  • [50] Robustness of speech recognition using genetic algorithms and a Mel-cepstral subspace approach
    Selouani, SA
    O'Shaughnessy, D
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 201 - 204