The ATR multilingual speech-to-speech translation system

Cited by: 62

Authors:

Nakamura, S [1]

Markov, K [1]

Nakaiwa, H [1]

Kikui, G [1]

Kawai, H [1]

Jitsuhiro, T [1]

Zhang, JS [1]

Yamamoto, H [1]

Sumita, E [1]

Yamamoto, S [1]

Affiliation:

[1] ATR Spoken Language Translat Res Labs, Kyoto 6190288, Japan

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006, Vol. 14, No. 2

Keywords:

example-based machine translation (EBMT); minimum description length (MDL); multiclass language model; speech-to-speech translation (S2S); statistical machine translation (SMT); successive state splitting (SSS); text-to-speech (TTS) conversion

DOI:

10.1109/TSA.2005.860774

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules of our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are multilingual and are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical machine learning framework forms the basis of our system design. We use a parallel multilingual database consisting of over 600 000 sentences that cover a broad range of travel-related conversations. Recent evaluation of the overall system showed that speech-to-speech translation quality is high, being at the level of a person having a Test of English for International Communication (TOEIC) score of 750 out of the perfect score of 990.
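The three-module arrangement named in the abstract (speech recognition, then text-to-text translation, then speech synthesis) can be pictured as a simple cascade. The following is only a minimal sketch of that data flow, not the ATR implementation: the class `CascadeS2ST`, the three callables, and the toy phrase table are all hypothetical stand-ins invented for illustration, and the "audio" here is just UTF-8 bytes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeS2ST:
    """Hypothetical cascade: recognizer -> translator -> synthesizer."""
    recognize: Callable[[bytes, str], str]       # (audio, source language) -> text
    translate: Callable[[str, str, str], str]    # (text, source, target) -> text
    synthesize: Callable[[str, str], bytes]      # (text, target language) -> audio

    def run(self, audio: bytes, src: str, tgt: str) -> bytes:
        text = self.recognize(audio, src)            # speech recognition stage
        translated = self.translate(text, src, tgt)  # text-to-text translation stage
        return self.synthesize(translated, tgt)      # speech synthesis stage


# Toy stand-ins (assumptions for the sketch; real modules would be statistical models).
def toy_recognize(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already a transcript


TOY_PHRASES = {("en", "ja", "where is the station"): "eki wa doko desu ka"}


def toy_translate(text: str, src: str, tgt: str) -> str:
    # Look up a known phrase; fall back to passing the text through unchanged.
    return TOY_PHRASES.get((src, tgt, text.lower()), text)


def toy_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is a waveform


pipeline = CascadeS2ST(toy_recognize, toy_translate, toy_synthesize)
result = pipeline.run(b"Where is the station", "en", "ja")
```

Keeping each stage behind a plain callable means any one module can be swapped per language pair without touching the others, which is one plausible reading of the abstract's statement that all three modules are multilingual.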

Pages: 365-376

Page count: 12
