Feature Vector Normalization with Combined Standard and Throat Microphones for Robust ASR

Cited by: 0
Authors
Buera, Luis [1]
Miguel, Antonio [1]
Saz, Oscar [1]
Ortega, Alfonso [1]
Lleida, Eduardo [1]
Affiliations
[1] Univ Zaragoza, Commun Technol Grp GTC, E-50009 Zaragoza, Spain
Keywords
Throat microphone; robust speech recognition; feature vector normalization
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose an on-line unsupervised compensation technique for robust speech recognition that combines standard and throat microphone feature vectors. The solution, called Multi-Environment Model-based Linear Normalization with Throat microphone information (MEMLINT), is an extension of the MEMLIN formulation. The standard microphone noisy space and the throat microphone space are modelled as GMMs, and a set of linear transformations, one associated with each pair of Gaussians (one from each GMM), is learnt from stereo training data. In addition, to compensate for some kinds of degradation that MEMLINT does not consider, we propose jointly using an on-line unsupervised acoustic model adaptation method based on rotation transformations over an expanded HMM-state space (augMented stAte space acousTic dEcoder, MATE). Experiments on a database that we recorded ourselves showed that the proposed approach significantly outperforms the single-microphone approach.
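The pair-of-Gaussians idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' MEMLINT implementation. It assumes diagonal-covariance GMMs that are already trained, bias-only transformations, a clean reference channel (`X_clean`) in the stereo training data, and a pair posterior formed as the product of the two per-microphone posteriors. All function and variable names are hypothetical.

```python
import numpy as np

def gmm_posteriors(X, weights, means, variances):
    """Posterior p(k | x) under a diagonal-covariance GMM. X has shape (N, D)."""
    diff = X[:, None, :] - means[None, :, :]                  # (N, K, D)
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_pdf                     # (N, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)         # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def pair_posteriors(Y, T, gmm_y, gmm_t):
    """Assumed pair posterior: product of noisy-mic and throat-mic posteriors."""
    p_y = gmm_posteriors(Y, *gmm_y)                           # (N, Ky)
    p_t = gmm_posteriors(T, *gmm_t)                           # (N, Kt)
    return p_y[:, :, None] * p_t[:, None, :]                  # (N, Ky, Kt)

def learn_pair_biases(Y, T, X_clean, gmm_y, gmm_t):
    """One bias vector per Gaussian pair: posterior-weighted mean of (y - x)."""
    pair = pair_posteriors(Y, T, gmm_y, gmm_t)
    num = np.einsum('nij,nd->ijd', pair, Y - X_clean)         # (Ky, Kt, D)
    den = pair.sum(axis=0)[:, :, None]                        # (Ky, Kt, 1)
    return num / np.maximum(den, 1e-10)

def normalize(Y, T, biases, gmm_y, gmm_t):
    """Subtract the posterior-weighted bias from each noisy feature vector."""
    pair = pair_posteriors(Y, T, gmm_y, gmm_t)
    return Y - np.einsum('nij,ijd->nd', pair, biases)
```

Here each GMM is passed as a `(weights, means, variances)` tuple with shapes `(K,)`, `(K, D)` and `(K, D)`. At test time only the noisy vectors `Y` and the throat vectors `T` are needed, which is what makes the compensation usable on-line.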
Pages: 1289-1292
Number of pages: 4