Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds

Author
Kohler, J
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The aim of this work is to exploit the acoustic-phonetic similarities between several languages. In recent work, cross-language HMM-based phoneme models have been used only for bootstrapping the language-dependent models, and the multi-lingual approach has been investigated only on very small speech corpora. In this paper, we introduce a statistical distance measure to determine the similarities of sounds. Further, we present a new technique to model multi-lingual phonemes. The experiments are conducted with the OGI Multi-Language Telephone Speech Corpus for American English, German and Spanish. In the first experiment, phoneme recognition rates between 39.0% and 53.9% are achieved using language-dependent models. Using cross-language models improves recognition for some phonemes, but on average recognition performance degrades. However, cross-language models speed up cross-language transfer and reduce the size of the phoneme inventory of multi-lingual speech recognition systems. Finally, a new method of modelling multi-lingual phonemes, which can be used for a variety of languages, is presented. This technique reduces the number of phoneme-based units in a multi-lingual speech recognition system.
Pages: 2195 - 2198
Page count: 4