Acoustic Feature Optimization Based on F-Ratio for Robust Speech Recognition

Cited by: 5
Authors
Sun, Yanqing [1]
Zhou, Yu [1]
Zhao, Qingwei [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIT Speech Lab, Beijing 100864, Peoples R China
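Source
Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China;
Keywords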
mismatched speech; robust speech recognition; F-Ratio; subband design; feature optimization;
DOI
10.1587/transinf.E93.D.2417
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper addresses performance degradation in mismatched speech recognition. F-Ratio analysis is used to measure the significance of different frequency bands for speech unit classification, and we find that frequencies around 1 kHz and 3 kHz, which are the upper bounds of the first and second formants for most vowels, should be emphasized relative to their treatment in Mel-frequency cepstral coefficients (MFCC). The analysis result is further observed to be stable in several typical mismatched situations. By analogy with the Mel-frequency scale, a new frequency scale called the F-Ratio-scale is therefore proposed to optimize the filter bank design for the MFCC features, so that each subband carries equal significance for speech unit classification. Under comparable conditions, the modified features yield relative reductions in sentence error rate, compared with MFCC, of 43.20% for emotion-affected speech recognition, 35.54% and 23.03% for noisy speech recognition at 15 dB and 0 dB SNR (signal-to-noise ratio) respectively, and 64.50% for the three years' 863 test data. Applying the F-Ratio analysis to the clean training set of the Aurora2 database demonstrates its robustness across languages, texts and sampling rates.
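The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, assuming the textbook definition of the F-ratio (variance of the class means divided by the average within-class variance, computed per frequency band) and assuming that "equal significance" means an equal share of the cumulative F-ratio per subband. The function names `f_ratio` and `equal_significance_edges`, and the use of linear interpolation to place the edges, are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def f_ratio(features, labels):
    """Per-band F-ratio (assumed definition): variance of the class means
    divided by the mean within-class variance, one value per band."""
    features = np.asarray(features, dtype=float)   # shape: (frames, bands)
    labels = np.asarray(labels)                    # shape: (frames,)
    classes = np.unique(labels)
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    class_vars = np.stack([features[labels == c].var(axis=0) for c in classes])
    between = class_means.var(axis=0)   # spread of the class means
    within = class_vars.mean(axis=0)    # average spread inside each class
    return between / within

def equal_significance_edges(band_edges_hz, ratios, n_subbands):
    """Place subband edges so each subband holds an equal share of the
    cumulative F-ratio (hypothetical construction of an F-ratio-like scale).
    band_edges_hz has len(ratios) + 1 entries: the analysis band edges."""
    cumulative = np.concatenate(([0.0], np.cumsum(ratios)))
    cumulative /= cumulative[-1]                    # normalize to [0, 1]
    targets = np.linspace(0.0, 1.0, n_subbands + 1)
    # Invert the cumulative curve: frequency at which each target is reached.
    return np.interp(targets, cumulative, band_edges_hz)

# Toy data: band 0 does not separate the two classes, band 1 does.
toy_features = [[0, -1], [2, 1], [0, 3], [2, 5]]
toy_labels = [0, 0, 1, 1]
toy_ratios = f_ratio(toy_features, toy_labels)

# A band with a high F-ratio receives a narrower subband (finer resolution).
toy_edges = equal_significance_edges([0, 1000, 2000, 3000, 4000], [3, 1, 1, 1], 2)
```

In this toy setting the first 1 kHz band carries half of the total F-ratio, so it becomes a subband of its own while the remaining 3 kHz share the second subband. This mirrors the stated goal of emphasizing the more discriminative frequency regions.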
Pages: 2417-2430
Page count: 14
Related papers
50 in total
  • [21] A robust feature selection method based on meta-heuristic optimization for speech emotion recognition
    Bagadi, Kesava Rao
    Sivappagari, Chandra Mohan Reddy
    EVOLUTIONARY INTELLIGENCE, 2024, 17 (02) : 993 - 1004
  • [22] A Novel Acoustic Feature Extraction Algorithm Based on Root Cepstrum Coefficients and CCBC for Robust Speech Recognition
    Wang, Xu
    Han, Zhiyan
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL I, PROCEEDINGS, 2008, : 643 - 647
  • [23] Feature extraction for robust speech recognition
    Dharanipragada, S
    2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002, : 855 - 858
  • [24] A New Feature Extraction Method for Bone-conducted Life Sounds based on F-ratio
    An, Yeteng
    Wang, Hongcui
    Hyon, Songgun
    Chen, Sai
    Dang, Jianwu
    INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY II, PTS 1-4, 2013, 411-414 : 1598 - 1604
  • [25] Novel robust feature extraction based on spectrally masked channel energy ratio (SMaChER) for speech recognition
    Ma, CX
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 37 - 40
  • [26] Speech Emotion Recognition Based on Multi Acoustic Feature Fusion
    Xiang, Shanshan
    Anwer, Sadiyagul
    Yilahun, Hankiz
    Hamdulla, Askar
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 338 - 346
  • [27] Speech Recognition Based on Concatenated Acoustic Feature and LightGBM Model
    Yu, Jiali
    Qu, Yuanyuan
    Zhang, Zhongkai
    Lu, Qidong
    Qin, Zhiliang
    Liu, Xiaowei
    TWELFTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING SYSTEMS, 2021, 11719
  • [28] Model-based feature compensation for robust speech recognition
    Shen, Haifeng
    Li, Qunxia
    Guo, Jun
    Liu, Gang
    FUNDAMENTA INFORMATICAE, 2006, 72 (04) : 529 - 539
  • [29] Model-based feature compensation for robust speech recognition
    School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing, 100876, China
    Unknown
    Unknown
    Fundam Inf, 2006, 4 (529-539):
  • [30] A robust speech recognition based on the feature of weighting combination ZCPA
    Zhang, Xueying
    Liang, Wuzhou
    ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 3, PROCEEDINGS, 2006, : 361 - +