The impact of ASR on speech-to-speech translation performance

Cited by: 0
Authors
Sarikaya, Ruhi [1]
Zhou, Bowen [1]
Povey, Daniel [1]
Afify, Mohamed [1]
Gao, Yuqing [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
speech recognition; ASR; machine translation; MT; performance metric;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper reports on experiments to quantify the impact of Automatic Speech Recognition (ASR) in general, and discriminatively trained ASR in particular, on Machine Translation (MT) performance. The Minimum Phone Error (MPE) training method is employed for building the discriminative ASR acoustic models, and a Weighted Finite State Transducer (WFST) based method is used for MT. The experiments are performed on a two-way English/Dialectal-Arabic speech-to-speech (S2S) translation task in the military/medical domain. We demonstrate the relationship between ASR and MT performance, measured by BLEU and human judgment, for both directions of the translation. Moreover, we question the use of the BLEU metric for assessing MT quality, present our observations, and draw some conclusions.
Pages: 1289 / +
Number of pages: 2