Phone set generation based on acoustic and contextual analysis for multilingual speech recognition

Cited by: 0

Authors:

Huang, Chien-Lin [1]

Wu, Chung-Hsien [1]

Affiliations:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

multilingual speech recognition; confusion matrix; acoustic likelihood; hyperspace analog to language model;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This study presents a novel approach to generating phone units for multilingual speech recognition. Acoustic and contextual analysis is performed to characterize multilingual phonetic units for phone set generation. A confusion matrix combining the acoustic and contextual similarities between every pair of phonetic units is constructed for phonetic unit clustering. Acoustic likelihood and the hyperspace analog to language (HAL) model are adopted to estimate the acoustic similarity and the contextual similarity of phone models, respectively. Experiments show that the generated phone set is compact and robust, and that it accounts for both acoustic and contextual information in multilingual speech recognition.
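
The pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the combination formula or the clustering rule, so the linear weight `alpha`, the cosine comparison of HAL vectors, the threshold-based merging, the unit labels, and the toy acoustic scores below are all assumptions made for illustration. In particular, the acoustic scores are placeholder numbers standing in for similarities that the paper derives from acoustic likelihood.

```python
from itertools import combinations
from math import sqrt


def hal_vectors(sequences, units, window=2):
    """Build HAL-style context vectors from phone sequences.

    Each unit's vector concatenates its weighted counts of preceding units
    (matrix row) and following units (matrix column); closer neighbours
    inside the window receive larger weights.
    """
    index = {u: i for i, u in enumerate(units)}
    n = len(units)
    cooc = [[0.0] * n for _ in range(n)]
    for seq in sequences:
        for pos, unit in enumerate(seq):
            for dist in range(1, window + 1):
                if pos - dist < 0:
                    break
                prev = seq[pos - dist]
                cooc[index[unit]][index[prev]] += window - dist + 1
    return {
        u: cooc[index[u]] + [cooc[j][index[u]] for j in range(n)]
        for u in units
    }


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combine(acoustic, contextual, alpha=0.5):
    """Assumed linear mix of the two similarity scores for each unit pair."""
    return {p: alpha * acoustic[p] + (1 - alpha) * contextual[p] for p in acoustic}


def cluster(units, similarity, threshold):
    """Merge every pair whose combined similarity reaches the threshold."""
    parent = {u: u for u in units}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for u in units:
        groups.setdefault(find(u), []).append(u)
    return sorted(groups.values())


# Hypothetical language-tagged units and toy transcriptions.
units = ["a_en", "a_zh", "s_en", "t_en"]
sequences = [["s_en", "a_en", "t_en"], ["s_en", "a_zh", "t_en"]]

# Placeholder acoustic similarities (made-up values, not measured).
acoustic = {pair: 0.1 for pair in combinations(units, 2)}
acoustic[("a_en", "a_zh")] = 0.9

vectors = hal_vectors(sequences, units)
contextual = {
    (a, b): cosine(vectors[a], vectors[b]) for a, b in combinations(units, 2)
}
phone_set = cluster(units, combine(acoustic, contextual), threshold=0.8)
print(phone_set)
```

In this toy data the two /a/ units occur in identical contexts and have a high placeholder acoustic score, so they merge into one shared unit while `s_en` and `t_en` stay separate.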

Citation

Pages: 1017 / +

Page count: 2

