Phone set generation based on acoustic and contextual analysis for multilingual speech recognition

Cited by: 0
Authors
Huang, Chien-Lin [1]
Wu, Chung-Hsien [1]
Affiliation
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
Keywords
multilingual speech recognition; confusion matrix; acoustic likelihood; hyperspace analog to language model;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This study presents a novel approach to generating phone units for multilingual speech recognition. Acoustic and contextual analysis is performed to characterize multilingual phonetic units for phone set generation. A confusion matrix that combines the acoustic and contextual similarities between every pair of phonetic units is constructed for phonetic unit clustering. Acoustic likelihood and the hyperspace analog to language (HAL) model are adopted to estimate the acoustic similarity and the contextual similarity of phone models, respectively. Experiments show that the generated phone set is compact and robust, and that it accounts for both acoustic and contextual information in multilingual speech recognition.
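The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two similarities are combined or which clustering criterion is used, so the linear interpolation weight `alpha`, the average-linkage merge rule, the function names, and the toy phone labels and similarity values below are all assumptions made for illustration.

```python
from itertools import combinations


def combine_similarity(acoustic, contextual, alpha=0.5):
    """Interpolate two symmetric similarity matrices (lists of lists).

    alpha is a hypothetical weight; the abstract does not specify one.
    """
    n = len(acoustic)
    return [
        [alpha * acoustic[i][j] + (1 - alpha) * contextual[i][j] for j in range(n)]
        for i in range(n)
    ]


def cluster_phones(phones, similarity, target_size):
    """Greedy average-linkage agglomerative clustering (assumed merge rule).

    Repeatedly merges the two most similar clusters until only
    target_size clusters remain.
    """
    clusters = [[i] for i in range(len(phones))]

    def linkage(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > target_size:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(phones[i] for i in c) for c in clusters]


# Toy data: hypothetical phone labels and made-up similarity scores.
phones = ["zh_a", "en_aa", "zh_s", "en_s"]
acoustic = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]
contextual = [
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.2, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
]

combined = combine_similarity(acoustic, contextual, alpha=0.5)
phone_set = cluster_phones(phones, combined, target_size=2)
print(phone_set)
```

With these toy values the two vowel-like units and the two fricative-like units end up in separate merged classes. In a real system the acoustic scores would come from model likelihoods and the contextual scores from HAL co-occurrence vectors, as the abstract describes.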
Pages: 1017 / +
Page count: 2