Multilingual phone recognition of spontaneous telephone speech

Cited by: 0
Authors
Corredor-Ardoy, C [1]
Lamel, L [1]
Adda-Decker, M [1]
Gauvain, JL [1]
Affiliation
[1] BOUYGUES TELECOM, F-78944 Velizy, France
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper we report on experiments with phone recognition of spontaneous telephone speech. Phone recognizers were trained and assessed on IDEAL, a multilingual corpus containing telephone speech in French, British English, German and Castilian Spanish. We investigated the influence of the training material composition (size and linguistic content) on recognition performance using context-independent (CI) Hidden Markov Models and phonotactic bigram models. We found that when testing on spontaneous speech data, using only spontaneous speech training data gave the highest phone accuracies for the four languages, even though this data comprises only 14% of the available training data. The use of context-dependent (CD) HMMs reduced the phone error across the four languages, with the average error reduced to 51.9% from the 57.4% obtained with CI models. We suggest a straightforward way of detecting non-speech phenomena. The basic idea is to remove sequences of consonants between two silence labels from the recognized phone strings prior to scoring. This simple technique yields a 5.4% relative reduction in the average phone error rate. The lowest phone error with CD models and filtering was obtained for Spanish (39.1%), with the four-language average being 49.1%.
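The filtering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the silence label "sil", the consonant set, and the function names filter_consonant_runs and relative_reduction are all assumptions made for the example, and the abstract does not say whether the two silence labels left adjacent after filtering are merged, so they are kept here as they are.

```python
from typing import Iterable, List, Set


def filter_consonant_runs(phones: Iterable[str],
                          consonants: Set[str],
                          silence: str = "sil") -> List[str]:
    """Drop runs made up only of consonants that lie between two silence labels.

    The label names and the consonant inventory are hypothetical; a real
    system would use the phone set of each language.
    """
    out: List[str] = []
    pending: List[str] = []   # phones seen since the last silence label
    after_silence = False     # True once at least one silence has been seen
    for p in phones:
        if p == silence:
            only_consonants = bool(pending) and all(q in consonants for q in pending)
            if not (after_silence and only_consonants):
                out.extend(pending)   # keep anything that is not a pure consonant run
            pending = []
            out.append(p)
            after_silence = True
        else:
            pending.append(p)
    out.extend(pending)  # trailing phones are not enclosed by silence, so keep them
    return out


def relative_reduction(before: float, after: float) -> float:
    """Relative error reduction, e.g. from the unfiltered to the filtered error rate."""
    return (before - after) / before


# Hypothetical recognizer output: "t k" between two silences is treated as noise.
CONSONANTS = {"p", "t", "k", "s", "m", "n"}
hyp = ["sil", "t", "k", "sil", "a", "t", "sil"]
filtered = filter_consonant_runs(hyp, CONSONANTS)

# Average phone error figures quoted in the abstract: 51.9% (CD) and 49.1% (CD + filtering).
reduction_pct = round(relative_reduction(51.9, 49.1) * 100, 1)
```

With the figures quoted in the abstract, going from 51.9% to 49.1% average phone error corresponds to the 5.4% relative reduction reported.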
Pages: 413-416
Number of pages: 4