Phone set generation based on acoustic and contextual analysis for multilingual speech recognition

Cited by: 0

Authors:

Huang, Chien-Lin [1]

Wu, Chung-Hsien [1]

Affiliations:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

multilingual speech recognition; confusion matrix; acoustic likelihood; hyperspace analog to language model;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This study presents a novel approach to generating phone units for multilingual speech recognition. Acoustic and contextual analysis is performed to characterize multilingual phonetic units for phone set generation. A confusion matrix combining the acoustic and contextual similarities between every pair of phonetic units is constructed for phonetic unit clustering. Acoustic likelihood and the hyperspace analog to language (HAL) model are adopted to estimate the acoustic and contextual similarity of phone models, respectively. Experiments show that the generated phone set is compact and robust, and that it accounts for both acoustic and contextual information in multilingual speech recognition.
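
The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes the pairwise acoustic and contextual similarities have already been estimated and scaled to [0, 1]; it does not compute acoustic likelihoods or a HAL model. The weighted-sum combination, the average-linkage merging rule, the threshold, the toy similarity values, and all names (combine_similarities, cluster_phones, the en_/zh_ phone labels) are hypothetical choices made for illustration, not the authors' published procedure.

```python
from typing import List

Matrix = List[List[float]]


def combine_similarities(acoustic: Matrix, contextual: Matrix,
                         weight: float = 0.5) -> Matrix:
    """Weighted sum of two similarity matrices (assumed combination rule)."""
    n = len(acoustic)
    return [[weight * acoustic[i][j] + (1.0 - weight) * contextual[i][j]
             for j in range(n)] for i in range(n)]


def cluster_phones(phones: List[str], sim: Matrix,
                   threshold: float) -> List[List[str]]:
    """Average-linkage agglomerative clustering (assumed merging rule).

    Repeatedly merges the most similar pair of clusters and stops when
    the best pair's average similarity falls below `threshold`.
    """
    clusters = [[i] for i in range(len(phones))]

    def linkage(a: List[int], b: List[int]) -> float:
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = linkage(clusters[x], clusters[y])
                if score > best:
                    best, best_pair = score, (x, y)
        if best_pair is None or best < threshold:
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [[phones[i] for i in c] for c in clusters]


# Toy data: hypothetical language-tagged phones and made-up similarities.
phones = ["en_s", "zh_s", "en_a", "zh_a"]
acoustic = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
contextual = [
    [1.0, 0.7, 0.3, 0.2],
    [0.7, 1.0, 0.2, 0.3],
    [0.3, 0.2, 1.0, 0.6],
    [0.2, 0.3, 0.6, 1.0],
]
combined = combine_similarities(acoustic, contextual, weight=0.5)
phone_set = cluster_phones(phones, combined, threshold=0.5)
print(phone_set)
```

In this toy setup, phones from different languages that are similar on both measures fall into the same cluster, so each merged cluster stands in for one shared unit of a more compact multilingual phone set.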

Citation

Pages: 1017 / +

Page count: 2

