ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks

Cited by: 0
Authors
Boito, Marcely Zanon [1]
Ortega, John [2]
Riguidel, Hugo [2]
Laurent, Antoine [2]
Barrault, Loic [2]
Bougares, Fethi [3]
Chaabani, Firas [3]
Nguyen, Ha [1, 5]
Barbier, Florentin [4]
Gahbiche, Souhir [4]
Esteve, Yannick [1]
Affiliations
[1] Avignon Univ, LIA, Avignon, France
[2] Le Mans Univ, LIUM, Le Mans, France
[3] ELYADATA Tunis, Tunis, Tunisia
[4] Airbus France, Blagnac, France
[5] LIG Grenoble Alpes Univ, St Martin Dheres, France
Funding
EU Horizon 2020
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2022: low-resource and dialect speech translation. For the Tunisian Arabic-English dataset (low-resource and dialect tracks), we build an end-to-end model as our joint primary submission, and compare it against cascaded models that leverage a large fine-tuned wav2vec 2.0 model for ASR. Our results show that, in our settings, pipeline approaches are still very competitive, and that with the use of transfer learning, they can outperform end-to-end models for speech translation (ST). For the Tamasheq-French dataset (low-resource track), our primary submission leverages intermediate representations from a wav2vec 2.0 model trained on 234 hours of Tamasheq audio, while our contrastive model uses a French phonetic transcription of the Tamasheq audio as input to a Conformer speech translation architecture jointly trained on automatic speech recognition, ST and machine translation losses. Our results highlight that self-supervised models trained on smaller sets of target data are more effective for low-resource end-to-end ST fine-tuning than large off-the-shelf models. Results also illustrate that even approximate phonetic transcriptions can improve ST scores.
Pages: 308-318
Number of pages: 11