MINTZAI: End-to-end Deep Learning for Speech Translation

Authors
Etchegoyhen, Thierry [1]
Arzelus, Haritz [1]
Gete, Harritxu [1]
Alvarez, Aitor [1]
Hernaez, Inma [2]
Navas, Eva [2]
Gonzalez-Docasal, Ander [1]
Osacar, Jaime [1]
Benites, Edson [1]
Ellakuria, Igor [3]
Calonge, Eusebi [4]
Martin, Maite [4]
Affiliations
[1] Basque Res & Technol Alliance BRTA, Vicomtech Fdn, Mendaro, Gipuzkoa, Spain
[2] Univ Basque Country, HiTZ Ctr Aholab, UPV EHU, Leioa, Spain
[3] ISEA, Arrasate Mondragon, Gipuzkoa, Spain
[4] Ametzagaina, Donostia San Sebastian, Gipuzkoa, Spain
Keywords
Speech Translation; Machine Translation; Speech Recognition; Text to Speech; Deep Learning
DOI
10.26342/2020-65-12
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech translation consists of translating speech in one language into text or speech in a different language. Such systems have numerous applications, particularly in multilingual communities such as the European Union. The standard approach in the field chains separate components for speech recognition, machine translation and speech synthesis. With the advances made possible by artificial neural networks and deep learning, the training of end-to-end speech translation systems has recently given rise to intense research and development activity. In this paper, we review the state of the art and describe the MINTZAI project, which is being carried out in this field.
Pages: 97-100
Number of pages: 4