The ATR multilingual speech-to-speech translation system

Cited by: 62

Authors:

Nakamura, S [1]

Markov, K [1]

Nakaiwa, H [1]

Kikui, G [1]

Kawai, H [1]

Jitsuhiro, T [1]

Zhang, JS [1]

Yamamoto, H [1]

Sumita, E [1]

Yamamoto, S [1]

Affiliations:

[1] ATR Spoken Language Translation Research Laboratories, Kyoto 619-0288, Japan

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / No. 2

Keywords:

example-based machine translation (EBMT); minimum description length (MDL); multiclass language model; speech-to-speech translation (S2S); statistical machine translation (SMT); successive state splitting (SSS); text-to-speech (TTS) conversion

DOI:

10.1109/TSA.2005.860774

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules of our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are multilingual and are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical machine learning framework forms the basis of our system design. We use a parallel multilingual database consisting of over 600 000 sentences that cover a broad range of travel-related conversations. Recent evaluation of the overall system showed that speech-to-speech translation quality is high, being at the level of a person having a Test of English for International Communication (TOEIC) score of 750 out of the perfect score of 990.

Pages: 365-376

Page count: 12
