ASSESSING EVALUATION METRICS FOR SPEECH-TO-SPEECH TRANSLATION

Cited by: 4
Authors
Salesky, Elizabeth [1]
Maeder, Julian [2]
Klinger, Severin [2]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Swiss Fed Inst Technol, Zurich, Switzerland
Keywords
evaluation; speech synthesis; speech translation; speech-to-speech; dialects
DOI
10.1109/ASRU51503.2021.9688073
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech-to-speech translation combines machine translation with speech synthesis, introducing evaluation challenges not present in either task alone. How to automatically evaluate speech-to-speech translation is an open question which has not previously been explored. Translating to speech rather than to text is often motivated by unwritten languages or languages without standardized orthographies. However, we show that the previously used automatic metric for this task is best equipped for standardized high-resource languages only. In this work, we first evaluate current metrics for speech-to-speech translation, and second assess how translation to dialectal variants rather than to standardized languages impacts various evaluation methods.
Pages: 733-740
Page count: 8