The ATR multilingual speech-to-speech translation system

Cited by: 62
Authors
Nakamura, S [1 ]
Markov, K [1 ]
Nakaiwa, H [1 ]
Kikui, G [1 ]
Kawai, H [1 ]
Jitsuhiro, T [1 ]
Zhang, JS [1 ]
Yamamoto, H [1 ]
Sumita, E [1 ]
Yamamoto, S [1 ]
Affiliation
[1] ATR Spoken Language Translat Res Labs, Kyoto 6190288, Japan
Keywords
example-based machine translation (EBMT); minimum description length (MDL); multiclass language model; speech-to-speech translation (S2S); statistical machine translation (SMT); successive state splitting (SSS); text-to-speech (TTS) conversion
DOI
10.1109/TSA.2005.860774
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules of our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are multilingual and are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical machine learning framework forms the basis of our system design. We use a parallel multilingual database consisting of over 600 000 sentences that cover a broad range of travel-related conversations. Recent evaluation of the overall system showed that speech-to-speech translation quality is high, being at the level of a person having a Test of English for International Communication (TOEIC) score of 750 out of the perfect score of 990.
Pages: 365-376
Page count: 12