Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

Cited by: 4

Authors:

Das, Biswajit ^{[1]}

Mandal, Sandipan ^{[1]}

Mitra, Pabitra ^{[1]}

Basu, Anupam ^{[1]}

Affiliations:

[1] Indian Inst Technol, Dept Comp Sci & Engn, Kharagpur 721302, W Bengal, India

Source:

PATTERN RECOGNITION LETTERS | 2013 / Vol. 34 / Issue 03

Keywords:

Aging speech recognition; Vocal tract length normalization (VTLN); Maximum likelihood linear transform (MLLT); Maximum likelihood linear regression (MLLR); Maximum a posteriori (MAP); Maximum mutual information estimation (MMIE); VOCAL-TRACT; EXPECTATION MAXIMIZATION; NORMALIZATION; AGE;

DOI:

10.1016/j.patrec.2012.10.029

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article describes the development of a Bengali speech recognition system for the aging population using various adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. In general, perceptual as well as acoustic variations exist among speakers, but they are more pronounced between young and aged populations. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all of this variability to build a robust system. Speaker normalization and adaptation techniques, however, attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. In the current study, conventional acoustic model adaptation methods, e.g. vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori, are combined to improve recognition accuracy. Moreover, the maximum mutual information estimation technique has been implemented. (C) 2012 Elsevier B.V. All rights reserved.


Pages: 335-343

Page count: 9

50 items in total

[21] Comparing Speaker Adaptation Methods for Visual Speech Recognition for Continuous Spanish
Gimeno-Gomez, David
Martinez-Hinarejos, Carlos-D.
APPLIED SCIENCES-BASEL, 2023, 13 (11):
[22] Analysis on MAP and MLLR Based Speaker Adaptation Techniques in Speech Recognition
Ramya, T.
Christina, Lilly S.
Vijayalakshmi, P.
Nagarajan, T.
2014 IEEE INTERNATIONAL CONFERENCE ON CIRCUIT, POWER AND COMPUTING TECHNOLOGIES (ICCPCT-2014), 2014, : 1753 - 1758
[23] Adaptation of precision matrix models on large vocabulary continuous speech recognition
Sim, KC
Gales, MJF
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 97 - 100
[24] SPEAKER ADAPTATION IN SPEECH RECOGNITION USING LINEAR-REGRESSION TECHNIQUES
COX, S
ELECTRONICS LETTERS, 1992, 28 (22) : 2093 - 2094
[25] SPEAKER ADAPTATION BY VARIABLE REFERENCE MODEL SUBSPACE AND APPLICATION TO LARGE VOCABULARY SPEECH RECOGNITION
Teng, Wen Xuan
Gravier, Guillaume
Bimbot, Frederic
Soufflet, Frederic
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4381 - 4384
[26] Improved discriminative training techniques for large vocabulary continuous speech recognition
Povey, D
Woodland, PC
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 45 - 48
[27] SPEAKER ADAPTATION IN A LIMITED SPEECH RECOGNITION SYSTEM
MAKHOUL, J
IEEE TRANSACTIONS ON COMPUTERS, 1971, C 20 (09) : 1057 - &
[28] Quick fMLLR for speaker adaptation in speech recognition
Varadarajan, Balakrishnan
Povey, Daniel
Chu, Stephen M.
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4297 - +
[29] Speaker Adaptation on Myanmar Spontaneous Speech Recognition
Naing, Hay Mar Soe
Pa, Win Pa
COMPUTATIONAL LINGUISTICS, PACLING 2017, 2018, 781 : 303 - 313
[30] XMLLR for Improved Speaker Adaptation in Speech Recognition
Povey, Daniel
Kuo, Hong-Kwang J.
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1705 - +
