An Improved Mel-Wiener Filter for Mel-LPC based Speech Recognition

Cited by: 0
Authors
Islam, Md. Babul [1]
Matsumoto, Hiroshi [1]
Yamamoto, Kazumasa [1]
Affiliations
[1] Shinshu Univ, Grad Sch Sci & Technol, Matsumoto, Nagano, Japan
Keywords
Noisy speech recognition; Mel-Wiener filter; Mel-LPC analysis; Bilinear transformation; Aurora 2 database
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We previously proposed a Mel-Wiener filter to enhance Mel-LPC spectra in the presence of additive noise. The filter was estimated by minimizing the sum of squared errors on the linear frequency scale and was implemented efficiently in the autocorrelation domain without denoising the input speech. In that system, speech and noise were separated using an energy-based VAD, and a very simple flooring technique was used for noise segments. In the present work, we improve both the VAD, using an autoregressive (AR) model of the noise, and the flooring technique. In addition, a lag window is applied to the estimated noise autocorrelation function to smooth the fine spectral structure arising from the high-order autocorrelation coefficients. As a result, a substantial improvement is obtained over the previous results.
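The lag-windowing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the window shape, its bandwidth, or the analysis order, so the Gaussian window, the 60 Hz bandwidth, the 8 kHz sampling rate, the lag order of 12, and all function names below are assumptions chosen for illustration. The sketch also works on an ordinary (linear-frequency) autocorrelation sequence, whereas the paper operates on Mel-LPC autocorrelation coefficients.

```python
import numpy as np


def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of a 1-D frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array(
        [np.dot(frame[: n - k], frame[k:]) / n for k in range(max_lag + 1)]
    )


def gaussian_lag_window(max_lag, bandwidth_hz, sample_rate_hz):
    """Gaussian lag window w[k] = exp(-0.5 * (2*pi*B*k/fs)^2).

    Assumed window shape; w[0] = 1, so the frame energy r[0] is unchanged.
    """
    k = np.arange(max_lag + 1)
    return np.exp(-0.5 * (2.0 * np.pi * bandwidth_hz * k / sample_rate_hz) ** 2)


def smooth_noise_autocorrelation(r_noise, bandwidth_hz=60.0, sample_rate_hz=8000.0):
    """Taper the higher lags of a noise autocorrelation estimate.

    Multiplying by a lag window in the lag domain corresponds to convolving
    the power spectrum with the window's transform, which smooths fine
    spectral detail carried by the high-order coefficients.
    """
    r_noise = np.asarray(r_noise, dtype=float)
    w = gaussian_lag_window(len(r_noise) - 1, bandwidth_hz, sample_rate_hz)
    return r_noise * w


# Hypothetical noise-only frame: 200 samples of white Gaussian noise.
rng = np.random.default_rng(0)
noise_frame = rng.standard_normal(200)
r = autocorrelation(noise_frame, max_lag=12)
r_smoothed = smooth_noise_autocorrelation(r)
```

Because the window equals 1 at lag 0 and decays with increasing lag, the zeroth coefficient is preserved while the higher-order coefficients are attenuated progressively.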
Pages: 45-48
Number of pages: 4
Related papers
50 in total
  • [1] Mel-wiener filter for Mel-LPC based speech recognition
    Islam, Md. Babul
    Yamamoto, Kazumasa
    Matsumoto, Hiroshi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (06): : 935 - 942
  • [2] Evaluation of MEL-LPC cepstrum in a large vocabulary continuous speech recognition
    Matsumoto, H
    Moroto, M
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 117 - 120
  • [3] GMM-Based two-stage mel-warped Wiener filter for robust speech recognition
    Lei, Jianjun
    Guo, Jun
    Liu, Gang
    Wang, Jian
    Shen, Halfeng
    Nie, Xiangfei
    [J]. DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2006, 13E : 827 - 830
  • [4] Recognition of Subsampled Speech Using a Modified Mel Filter Bank
    Bhuvanagiri, Kiran Kumar
    Kopparapu, Sunil Kumar
    [J]. ADVANCES IN COMPUTING AND COMMUNICATIONS, PT 4, 2011, 193 : 293 - 299
  • [5] Recognition of subsampled speech using a modified Mel filter bank
    Kopparapu, Sunil Kumar
    Bhuvanagiri, Kiran Kumar
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2013, 39 (02) : 655 - 662
  • [6] Robust speech recognition by selecting mel-filter banks
    Wu, Yun-Peng
    Mao, Jia-Min
    Li, Wei-Feng
    [J]. PROCEEDINGS OF THE 2ND ANNUAL INTERNATIONAL CONFERENCE ON ELECTRONICS, ELECTRICAL ENGINEERING AND INFORMATION SCIENCE (EEEIS 2016), 2016, 117 : 407 - 416
  • [7] Improved DTW Speech Recognition Algorithm Based on the MEL Frequency Cepstral Coefficients
    Wei Ming-zhe
    Li Xi
    Ren Li-mian
    [J]. 12TH ANNUAL MEETING OF CHINA ASSOCIATION FOR SCIENCE AND TECHNOLOGY ON INFORMATION AND COMMUNICATION TECHNOLOGY AND SMART GRID, 2010, : 235 - 238
  • [8] Improved speech emotion recognition with Mel frequency magnitude coefficient
    Ancilin, J.
    Milton, A.
    [J]. APPLIED ACOUSTICS, 2021, 179
  • [9] The Improvement and Implementation of Speech Enhancement Based on Mel frequency Wiener Filtering
    Fan Binwen
    Wang Yongjun
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON APPLIED SCIENCE AND ENGINEERING INNOVATION, 2015, 12 : 1814 - 1818
  • [10] Mel-MViTv2: Enhanced Speech Emotion Recognition With Mel Spectrogram and Improved Multiscale Vision Transformers
    Ong, Kah Liang
    Lee, Chin Poo
    Lim, Heng Siong
    Lim, Kian Ming
    Alqahtani, Ali
    [J]. IEEE ACCESS, 2023, 11 : 108571 - 108579