Region Dependent Transform on MLP Features for Speech Recognition

Cited by: 0
Authors
Ng, Tim [1]
Zhang, Bing [1]
Matsoukas, Spyros [1]
Nguyen, Long [1]
Affiliations
[1] Raytheon BBN Technol, Cambridge, MA 02138 USA
Keywords
Multi-Layer Perceptrons; bottleneck features; Region Dependent Transform; discriminative training; Mandarin speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, Region Dependent Transform (RDT) is used as a feature extraction process to combine traditional short-term acoustic features with features derived from Multi-Layer Perceptrons (MLPs) that are trained on long-term features. Compared to the conventional feature augmentation approach, a substantial improvement is obtained. Moreover, an improved RDT training procedure, in which speaker-dependent transforms are taken into account, is proposed for feature combination in Speaker Adaptive Training. By incorporating the higher-dimensional features output from the layer prior to the bottleneck layer into our Speech-to-Text (STT) system using RDT, a significant improvement is achieved compared to using the conventional bottleneck features. In summary, by using the features derived from MLP with RDT, an 8.2% to 11.4% relative reduction in Character Error Rate is achieved for our Mandarin STT systems.
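The abstract does not spell out the RDT formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: the short-term and MLP-derived features are concatenated, the feature space is softly partitioned into regions by a diagonal-covariance Gaussian mixture, and the output is the posterior-weighted sum of one linear projection per region. All function and parameter names are hypothetical, bias terms are omitted, and the discriminative training of the projections (and the speaker-dependent transforms mentioned above) is not shown.

```python
import math


def region_posteriors(x, means, variances, weights):
    """Posterior probability of each Gaussian region given frame x.

    Assumption: regions are modeled by a diagonal-covariance GMM.
    """
    log_scores = []
    for mu, var, w in zip(means, variances, weights):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_scores.append(ll)
    # Normalize in the log domain for numerical stability.
    peak = max(log_scores)
    exps = [math.exp(s - peak) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def rdt_transform(short_term, mlp_feats, transforms, means, variances, weights):
    """Combine two feature streams with region-dependent linear projections.

    transforms[r] is a (hypothetical) projection matrix for region r,
    stored as a list of rows; every region maps to the same output size.
    """
    x = list(short_term) + list(mlp_feats)  # concatenated input frame
    gammas = region_posteriors(x, means, variances, weights)
    y = [0.0] * len(transforms[0])
    for gamma, matrix in zip(gammas, transforms):
        for i, row in enumerate(matrix):
            y[i] += gamma * sum(a * xi for a, xi in zip(row, x))
    return y


if __name__ == "__main__":
    # Toy example: 1 short-term dim + 1 MLP dim, two regions, 1-dim output.
    out = rdt_transform(
        short_term=[1.0],
        mlp_feats=[2.0],
        transforms=[[[1.0, 0.0]], [[0.0, 1.0]]],
        means=[[1.0, 2.0], [3.0, 0.0]],
        variances=[[1.0, 1.0], [1.0, 1.0]],
        weights=[0.5, 0.5],
    )
    print(out)
```

In this sketch, a frame that falls clearly inside one region is projected almost entirely by that region's matrix, while frames near a boundary receive a blend of the neighboring projections. Simple feature augmentation, by contrast, would pass the concatenated vector on unchanged.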
Pages: 228-231
Page count: 4