Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

Cited by: 4
Authors
Das, Biswajit [1]
Mandal, Sandipan [1]
Mitra, Pabitra [1]
Basu, Anupam [1]
Affiliations
[1] Indian Inst Technol, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India
Keywords
Aging speech recognition; Vocal tract length normalization (VTLN); Maximum likelihood linear transform (MLLT); Maximum likelihood linear regression (MLLR); Maximum a posteriori (MAP); Maximum mutual information estimation (MMIE); VOCAL-TRACT; EXPECTATION MAXIMIZATION; NORMALIZATION; AGE;
DOI
10.1016/j.patrec.2012.10.029
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article describes the development of a Bengali speech recognition system for the aging population using various adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. In general, perceptual as well as acoustic variations exist among speakers, but these variations are more pronounced between young and aged populations. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand large amounts of training data covering all of this variability to build a robust speech recognition system. Speaker normalization and adaptation techniques, however, attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring large amounts of training data. In the current study, conventional acoustic model adaptation methods, e.g. vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori, are combined to improve recognition accuracy. Moreover, the maximum mutual information estimation technique has been implemented in this study. (C) 2012 Elsevier B.V. All rights reserved.
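As a rough illustration of one technique named in the abstract, maximum a posteriori (MAP) adaptation, the sketch below updates a single Gaussian mean from a small amount of adaptation data. It uses the standard relevance-factor form of the MAP mean update and is not taken from the paper: the function name `map_adapt_mean`, the relevance factor `tau`, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean (hypothetical helper, not from the paper).

    prior_mean : mean from the speaker-independent model, shape (D,)
    frames     : adaptation feature vectors, shape (T, D)
    posteriors : occupation probability of this Gaussian per frame, shape (T,)
    tau        : relevance factor (assumed value); larger tau trusts the prior more
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)

    occupancy = posteriors.sum()          # soft count of frames assigned here
    weighted_sum = posteriors @ frames    # posterior-weighted sum of frames
    # Interpolate between the prior mean and the data, weighted by tau and occupancy.
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)


# Toy usage: two adaptation frames pull the prior mean toward the new speaker.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 2.0], [4.0, 4.0]], [1.0, 1.0], tau=2.0)
print(adapted)
```

With little adaptation data the result stays close to the prior mean, and with more data it approaches the data mean, which is why MAP-style updates are often paired with transform-based methods such as MLLR when data per speaker is scarce.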
Pages: 335-343
Page count: 9