Low-Latency Neural Speech Translation

Cited by: 23
Authors
Niehues, Jan [1]
Ngoc-Quan Pham [1]
Thanh-Le Ha [1]
Sperber, Matthias [1]
Waibel, Alex [1]
Affiliations
[1] KIT, Inst Anthropomat & Robot, Karlsruhe, Germany
Keywords
speech translation; low-latency
DOI
10.21437/Interspeech.2018-1055
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Through the development of neural machine translation, the quality of machine translation systems has been improved significantly. By exploiting advancements in deep learning, systems are now able to better approximate the complex mapping from source sentences to target sentences. But with this ability, new challenges also arise. An example is the translation of partial sentences in low-latency speech translation. Since the model has only seen complete sentences in training, it will always try to generate a complete sentence, though the input may only be a partial sentence. We show that NMT systems can be adapted to scenarios where no task-specific training data is available. Furthermore, this is possible without losing performance on the original training data. We achieve this by creating artificial data and by using multi-task learning. After adaptation, we are able to reduce the number of corrections displayed during incremental output construction by 45%, without a decrease in translation quality.
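The abstract names two ingredients, artificial partial-sentence data and a count of corrections during incremental output, without giving the procedure. The following is a minimal sketch under stated assumptions, not the authors' implementation: the proportional truncation rule and the helper names `make_partial_pairs` and `count_corrections` are hypothetical, chosen only to illustrate the idea.

```python
from typing import List, Sequence, Tuple


def make_partial_pairs(
    src_tokens: Sequence[str], tgt_tokens: Sequence[str]
) -> List[Tuple[List[str], List[str]]]:
    """Build artificial (source prefix, target prefix) training pairs.

    Assumption: the target prefix is cut in proportion to the source
    prefix length. The abstract does not state how targets are truncated.
    """
    n, m = len(src_tokens), len(tgt_tokens)
    pairs = []
    for i in range(1, n + 1):
        j = (i * m) // n  # proportional target length, rounded down
        pairs.append((list(src_tokens[:i]), list(tgt_tokens[:j])))
    return pairs


def count_corrections(outputs: Sequence[Sequence[str]]) -> int:
    """Count previously displayed words that a later output rewrites.

    Assumption: a word counts as corrected if it lies beyond the longest
    common prefix of two consecutive incremental outputs.
    """
    total = 0
    for prev, cur in zip(outputs, outputs[1:]):
        common = 0
        for a, b in zip(prev, cur):
            if a != b:
                break
            common += 1
        total += len(prev) - common
    return total
```

In a multi-task setup, pairs like these would be mixed with the original full-sentence pairs so that the model learns to stop early on partial input while retaining its full-sentence behaviour.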
Pages: 1293-1297
Page count: 5
Related papers
  • [1] Dynamic Transcription for Low-latency Speech Translation
    Niehues, Jan
    Nguyen, Thai Son
    Cho, Eunah
    Ha, Thanh-Le
    Kilgour, Kevin
    Mueller, Markus
    Sperber, Matthias
    Stueker, Sebastian
    Waibel, Alex
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2513 - 2517
  • [2] Amortized Neural Networks for Low-Latency Speech Recognition
    Macoskey, Jonathan
    Strimel, Grant P.
    Su, Jinru
    Rastrow, Ariya
    [J]. INTERSPEECH 2021, 2021, : 4558 - 4562
  • [3] Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
    Zheng, Renjie
    Ma, Mingbo
    Zheng, Baigong
    Liu, Kaibo
    Yuan, Jiahong
    Church, Kenneth
    Huang, Liang
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 3928 - 3937
  • [4] Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection
    Liu, Danni
    Spanakis, Gerasimos
    Niehues, Jan
    [J]. INTERSPEECH 2020, 2020, : 3620 - 3624
  • [5] LOW-LATENCY DEEP CLUSTERING FOR SPEECH SEPARATION
    Wang, Shanshan
    Naithani, Gaurav
    Virtanen, Tuomas
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 76 - 80
  • [6] Scalable Low-Latency Persistent Neural Machine Translation on CPU Server with Multiple FPGAs
    Nurvitadhi, Eriko
    Boutros, Andrew
    Budhkar, Prerna
    Jafari, Ali
    Kwon, Dongup
    Sheffield, David
    Prabhakaran, Abirami
    Gururaj, Karthik
    Appana, Pranavi
    Naik, Mishali
    [J]. 2019 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2019), 2019, : 307 - 310
  • [7] EXPLORING TRADEOFFS IN MODELS FOR LOW-LATENCY SPEECH ENHANCEMENT
    Wilson, Kevin
    Chinen, Michael
    Thorpe, Jeremy
    Patton, Brian
    Hershey, John
    Saurous, Rif A.
    Skoglund, Jan
    Lyon, Richard F.
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 366 - 370
  • [8] LOW-LATENCY APPROXIMATION OF BIDIRECTIONAL RECURRENT NETWORKS FOR SPEECH DENOISING
    Wichern, Gordon
    Lukin, Alexey
    [J]. 2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017, : 66 - 70
  • [9] Low-latency transformer model for streaming automatic speech recognition
    Miao, Haoran
    Cheng, Gaofeng
    Zhang, Pengyuan
    [J]. ELECTRONICS LETTERS, 2022, 58 (01) : 44 - 46
  • [10] LOW-LATENCY SPEECH SEPARATION GUIDED DIARIZATION FOR TELEPHONE CONVERSATIONS
    Morrone, Giovanni
    Cornell, Samuele
    Raj, Desh
    Serafini, Luca
    Zovato, Enrico
    Brutti, Alessio
    Squartini, Stefano
    [J]. 2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 641 - 646