On the Jointly Unsupervised Feature Vector Normalization and Acoustic Model Compensation for Robust Speech Recognition

Cited by: 0
Authors
Buera, Luis [1]
Miguel, Antonio [1]
Lleida, Eduardo [1]
Saz, Oscar [1]
Ortega, Alfonso [1]
Affiliations
[1] Univ Zaragoza, GTC, E-50009 Zaragoza, Spain
Keywords
robust speech recognition; feature vector normalization; acoustic model adaptation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To compensate for the mismatch between training and testing conditions, an unsupervised hybrid compensation technique is proposed. It combines Multi-Environment Model based LInear Normalization (MEMLIN) with a novel acoustic model adaptation method based on rotation transformations. In a training process, a set of rotation transformations between clean and MEMLIN-normalized data is estimated by linear regression. Each MEMLIN-normalized frame is then decoded using expanded acoustic models, which are obtained from the reference models and the set of rotation transformations. During the search, a modified Viterbi algorithm selects one of the rotation transformations online for each frame according to the maximum likelihood (ML) criterion. Experiments were carried out on the Spanish SpeechDat Car database. MEMLIN over standard ETSI front-end parameters reaches a mean improvement in WER of 75.53%, while the proposed hybrid solution reaches 90.54%.
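The per-frame selection step described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes 2-D features, pure rotations parametrized by an angle, and a single isotropic Gaussian in place of the HMM states searched by the modified Viterbi algorithm. The names `rotate`, `log_likelihood`, and `select_transform` are hypothetical, and the angles here are fixed by hand rather than estimated by linear regression.

```python
import math

def rotate(angle, vec):
    """Apply a 2-D rotation by `angle` radians to `vec` (assumed transform form)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (c * x - s * y, s * x + c * y)

def log_likelihood(frame, mean, var=1.0):
    """Log density of an isotropic 2-D Gaussian with variance `var`."""
    sq_dist = sum((f - m) ** 2 for f, m in zip(frame, mean))
    return -math.log(2 * math.pi * var) - sq_dist / (2 * var)

def select_transform(frame, ref_mean, angles):
    """Return (index, score) of the rotation that maximizes the frame likelihood.

    The "expanded" model is the reference mean rotated by each candidate
    angle; the ML criterion keeps the candidate with the highest score.
    """
    scores = [log_likelihood(frame, rotate(a, ref_mean)) for a in angles]
    best = max(range(len(angles)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical set of rotations and a reference (clean-model) mean.
angles = [0.0, math.pi / 6, math.pi / 3]
ref_mean = (1.0, 0.0)

# Toy "normalized" frames, each lying close to a different rotated mean.
frames = [(1.0, 0.05), (0.85, 0.52), (0.5, 0.9)]
choices = [select_transform(f, ref_mean, angles)[0] for f in frames]
print(choices)
```

In a full decoder this choice would be made inside the Viterbi recursion for every state and frame. The sketch isolates only the maximum-likelihood selection among expanded models.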
Pages: 1381-1384
Page count: 4