Phone set generation based on acoustic and contextual analysis for multilingual speech recognition

Cited by: 0

Authors:

Huang, Chien-Lin [1]

Wu, Chung-Hsien [1]

Affiliation:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

multilingual speech recognition; confusion matrix; acoustic likelihood; hyperspace analog to language model;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This study presents a novel approach to generating phone units for multilingual speech recognition. Acoustic and contextual analysis is performed to characterize multilingual phonetic units for phone set generation. A confusion matrix combining the acoustic and contextual similarities between every pair of phonetic units is constructed for phonetic unit clustering. Acoustic likelihood and the hyperspace analog to language (HAL) model are adopted to estimate the acoustic similarity and the contextual similarity of phone models, respectively. Experiments show that the generated phone set is compact and robust, and that it accounts for both acoustic and contextual information in multilingual speech recognition.
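
The clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two similarities are combined or which clustering procedure is used, so everything below is an assumption made for illustration: the weighted sum with weight `alpha`, the greedy average-linkage merging, the function names `combine_similarity` and `cluster_phones`, the phone labels, and the toy similarity values. It is not the authors' method, and the matrices are not computed from acoustic likelihoods or a HAL model.

```python
from itertools import combinations


def combine_similarity(acoustic, contextual, alpha=0.5):
    """Blend two symmetric similarity matrices (weighted sum is an assumption)."""
    n = len(acoustic)
    return [
        [alpha * acoustic[i][j] + (1 - alpha) * contextual[i][j] for j in range(n)]
        for i in range(n)
    ]


def cluster_phones(phones, similarity, target_size):
    """Greedily merge the most similar clusters until target_size remain."""
    clusters = [[i] for i in range(len(phones))]

    def linkage(a, b):
        # Average pairwise similarity between two clusters of phone indices.
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > target_size:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [[phones[i] for i in c] for c in clusters]


# Hypothetical phone labels and made-up similarity values, for illustration only.
phones = ["zh_a", "en_aa", "zh_s", "en_s"]
acoustic = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9],
    [0.2, 0.1, 0.9, 1.0],
]
contextual = [
    [1.0, 0.6, 0.3, 0.2],
    [0.6, 1.0, 0.2, 0.3],
    [0.3, 0.2, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]

combined = combine_similarity(acoustic, contextual, alpha=0.5)
phone_set = cluster_phones(phones, combined, target_size=2)
print(phone_set)
```

With these toy values the two vowel-like units and the two fricative-like units end up in separate clusters. In a real system the weight and the target phone set size would presumably be tuned on held-out recognition results.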

Pages: 1017 / +

Page count: 2
