Region Dependent Transform on MLP Features for Speech Recognition

Cited by: 0
Authors
Ng, Tim [1]
Zhang, Bing [1]
Matsoukas, Spyros [1]
Nguyen, Long [1]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
Keywords
Multi-Layer Perceptrons; bottleneck features; Region Dependent Transform; discriminative training; Mandarin speech recognition;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this work, the Region Dependent Transform (RDT) is used as a feature extraction process to combine traditional short-term acoustic features with features derived from Multi-Layer Perceptrons (MLPs) trained on long-term features. Compared with the conventional feature augmentation approach, a substantial improvement is obtained. Moreover, an improved RDT training procedure that takes speaker-dependent transforms into account is proposed for feature combination in Speaker Adaptive Training. By incorporating the higher-dimensional features output from the layer prior to the bottleneck layer into our Speech-to-Text (STT) system using RDT, a significant improvement is achieved compared with using the conventional bottleneck features. In summary, by using the features derived from the MLP with RDT, an 8.2% to 11.4% relative reduction in Character Error Rate is achieved for our Mandarin STT systems.
Pages: 228-231
Page count: 4
Related papers
50 in total
  • [21] Investigating Low-Distortion Speech Enhancement with Discrete Cosine Transform Features for Robust Speech Recognition
    Tsao, Yu-Sheng
    Hung, Jeih-Weih
    Ho, Kuan-Hsun
    Chen, Berlin
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 131 - 136
  • [22] Applying dynamic context into MLP/HMM speech recognition system
    Salmela, P
    [J]. COMPUTER SPEECH AND LANGUAGE, 2000, 15 (03): : 233 - 255
  • [23] An HMM/MLP hybrid approach for improving discrimination in speech recognition
    Na, K
    Chae, SI
    [J]. IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE, 1998, : 156 - 159
  • [24] Speaker-Dependent Bottleneck Features for Egyptian Arabic Speech Recognition
    Romanenko, Aleksei
    Mendelev, Valentin
    [J]. SPEECH AND COMPUTER, 2016, 9811 : 620 - 626
  • [25] MLP-BASED FACTOR ANALYSIS FOR TANDEM SPEECH RECOGNITION
    Ferras, Marc
    Bourlard, Herve
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6719 - 6723
  • [26] A study on recognition of speech based on HMM/MLP hybrid network
    Huang, XY
    Ma, XH
    Li, X
    Fu, YQ
    Lu, JR
    [J]. 2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 718 - 721
  • [27] Recent Progress on the Discriminative Region-dependent Transform for Speech Feature Extraction
    Zhang, Bing
    Matsoukas, Spyros
    Schwartz, Richard
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1495 - +
  • [28] Wavelet Transform Based Features Vector Extraction in Isolated Words Speech Recognition System
    Al-Qaraawi, Salih M.
    Mahmood, Sarah Shukur
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON COMMUNICATION SYSTEMS, NETWORKS & DIGITAL SIGNAL PROCESSING (CSNDSP), 2014, : 847 - 850
  • [29] Acoustic features based on auditory model and adaptive fractional Fourier transform for speech recognition
    Yin, Hui
    Xie, Xiang
    Kuang, Jingming
    [J]. Chinese Journal of Acoustics, 2011, 30 (04) : 453 - 463