The impact of ASR on speech-to-speech translation performance

Cited by: 0

Authors:

Sarikaya, Ruhi [1]

Zhou, Bowen [1]

Povey, Daniel [1]

Afify, Mohamed [1]

Gao, Yuqing [1]

Affiliation:

[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

speech recognition; ASR; machine translation; MT; performance metric;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper reports on experiments to quantify the impact of Automatic Speech Recognition (ASR) in general, and of discriminatively trained ASR in particular, on Machine Translation (MT) performance. The Minimum Phone Error (MPE) training method is employed to build the discriminative ASR acoustic models, and a Weighted Finite State Transducer (WFST) based method is used for MT. The experiments are performed on a two-way English/Dialectal-Arabic speech-to-speech (S2S) translation task in the military/medical domain. We demonstrate the relationship between ASR and MT performance, measured by BLEU and human judgment, for both directions of the translation. Moreover, we question the use of the BLEU metric for assessing MT quality, present our observations, and draw some conclusions.

Pages: 1289 / +

Page count: 2

50 items in total

[1] Developing high performance ASR in the IBM multilingual speech-to-speech translation system
Cui, Xiaodong
Gu, Liang
Xiang, Bing
Zhang, Wei
Gao, Yuqing
[J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 5121 - 5124
[2] Recent improvements and performance analysis of ASR and MT in a speech-to-speech translation system
Stallard, David
Kao, Chia-lin
Krstovski, Kriste
Liu, Daben
Natarajan, Prem
Prasad, Rohit
Saleem, Shirin
Subramanian, Krishna
[J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4973 - 4976
[3] Impacts of machine translation and speech synthesis on speech-to-speech translation
Hashimoto, Kei
Yamagishi, Junichi
Byrne, William
King, Simon
Tokuda, Keiichi
[J]. SPEECH COMMUNICATION, 2012, 54 (07) : 857 - 866
[4] The NESPOLE! speech-to-speech translation system
Lavie, A
Levin, L
Frederking, R
Pianesi, F
[J]. MACHINE TRANSLATION: FROM RESEARCH TO REAL USERS, 2002, 2499 : 240 - 243
[5] Hierarchical Classification for Speech-to-Speech Translation
Ettelaie, Emil
Georgiou, Panayiotis G.
Narayanan, Shrikanth S.
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2534 - 2537
[6] SIMULTANEOUS SPEECH-TO-SPEECH TRANSLATION SYSTEM WITH TRANSFORMER-BASED INCREMENTAL ASR, MT, AND TTS
Fukuda, Ryo
Novitasari, Sashi
Oka, Yui
Kano, Yasumasa
Yano, Yuki
Ko, Yuka
Tokuyama, Hirotaka
Doi, Kosuke
Yanagita, Tomoya
Sakti, Sakriani
Sudoh, Katsuhito
Nakamura, Satoshi
[J]. 2021 24TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2021, : 186 - 192
[7] Towards Machine Speech-to-speech Translation
Satoshi, Nakamura
Sudoh, Katsuhito
Sakti, Sakriani
[J]. TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO, 2019, (17) : 81 - 87
[8] Prosody generation for speech-to-speech translation
Aguero, Pablo Daniel
Adell, Jordi
Bonafonte, Antonio
[J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 557 - 560
[9] Contextual reasoning in speech-to-speech translation
Koch, S
Küssner, U
Stede, M
Tidhar, D
[J]. NATURAL LANGUAGE PROCESSING-NLP 2000, PROCEEDINGS, 2000, 1835 : 283 - 292
[10] Language Identification for Speech-to-Speech Translation
Lim, Daniel Chung Yong
Lane, Ian
[J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 204 - 207
