Comparison and combination of features in a hybrid HMM/MLP and a HMM/GMM speech recognition system

Cited by: 25
Authors
Pujol, P
Pol, S
Nadeu, C
Hagen, A
Bourlard, H
Affiliations
[1] Univ Politecn Catalunya, Talp Res Ctr, ES-08034 Barcelona, Spain
[2] INESC ID, Spoken Language Syst Lab L2F, P-1000029 Lisbon, Portugal
[3] IDIAP, CH-1920 Martigny, Switzerland
Source
Keywords
frequency filtering (FF); multistream; MLP; product rule; relative spectra (Rasta); robustness;
DOI
10.1109/TSA.2004.834466
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Recently, the advantages of the spectral parameters obtained by frequency filtering (FF) of the logarithmic filter-bank energies (logFBEs) have been reported. These parameters, which are frequency derivatives of the logFBEs, lie in the frequency domain, and have shown good recognition performance with respect to the conventional mel-frequency cepstral coefficients (MFCCs) for hidden Markov model (HMM) based systems. In this paper, the FF features are first compared with the MFCCs and the relative spectral perceptual linear prediction (Rasta-PLP) features using both a hybrid HMM/MLP and a usual HMM/Gaussian mixture model (HMM/GMM) based recognition system, for both clean and noisy speech. Taking advantage of the ability of the hybrid system to deal with correlated features, the inclusion of both the frequency second-derivatives and the raw logFBEs as additional features is proposed and tested. Moreover, the robustness of these features in noisy conditions is enhanced by combining the FF technique with the Rasta temporal filtering approach. Finally, a study of the FF features in the framework of multistream processing is presented. The best recognition results for both clean and noisy speech are obtained from the multistream combination of the J-Rasta-PLP features and the FF features.
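The two ideas named in the abstract and keywords, frequency filtering of the logFBEs and product-rule combination of streams, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the first-order filter z - z^-1 applied along the frequency axis, zero padding at the band edges, and equal class priors in the product rule; none of these details is stated in the abstract. The function names `frequency_filter` and `product_rule` are hypothetical.

```python
import math


def frequency_filter(log_fbe):
    """Filter one frame of log filter-bank energies along the frequency axis.

    Assumption: the filter is z - z^-1, so the output for band k is
    S(k+1) - S(k-1), with bands outside the range treated as zero.
    """
    q = len(log_fbe)
    padded = [0.0] + list(log_fbe) + [0.0]
    # padded[i] holds band i-1, so padded[k+2] is S(k+1) and padded[k] is S(k-1).
    return [padded[k + 2] - padded[k] for k in range(q)]


def product_rule(stream_posteriors):
    """Combine per-stream class posteriors by a normalised product.

    Assumption: equal class priors, so no prior correction term is applied.
    stream_posteriors is a list of lists, one posterior vector per stream.
    """
    n_classes = len(stream_posteriors[0])
    # Sum logs instead of multiplying probabilities to avoid underflow.
    log_prod = [
        sum(math.log(max(p[c], 1e-12)) for p in stream_posteriors)
        for c in range(n_classes)
    ]
    peak = max(log_prod)
    unnorm = [math.exp(v - peak) for v in log_prod]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

In a hybrid HMM/MLP setting, each stream's posterior vector would typically come from an MLP trained on that stream's features (for example FF or J-Rasta-PLP), and the combined vector would be passed to the HMM decoder.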
Pages: 14-22
Number of pages: 9