Dynamic Transcription for Low-latency Speech Translation

Cited by: 20
Authors
Niehues, Jan [1]
Nguyen, Thai Son [1]
Cho, Eunah [1]
Ha, Thanh-Le [1]
Kilgour, Kevin [1]
Mueller, Markus [1]
Sperber, Matthias [1]
Stueker, Sebastian [1]
Waibel, Alex [1]
Affiliations
[1] Karlsruhe Inst Technol, Karlsruhe, Germany
Funding
European Union Horizon 2020
Keywords
speech translation; low-latency
DOI
10.21437/Interspeech.2016-154
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Latency is one of the main challenges in simultaneous spoken language translation. While significant improvements in recent years have led to high-quality automatic translations, their usefulness in real-time settings is still severely limited by the large delay between the input speech and the delivered translation. In this paper, we present a novel scheme that drastically reduces the latency of a large-scale speech translation system. Within this scheme, the transcribed text and its translation can be updated when more context becomes available, even after they have been presented to the user. This allows us to display an initial transcript and its translation to the user with very low latency. If necessary, both transcript and translation can later be updated to better, more accurate versions until the final versions are eventually displayed. Using this framework, we are able to reduce the latency of the source language transcript by half. For the translation, an average delay of 3.3 s was achieved, which is more than twice as fast as our initial system.
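
The update behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `RevisableDisplay`, its methods `show` and `render`, and the use of integer segment IDs are all hypothetical names chosen for illustration. The sketch only shows the idea that an early hypothesis is displayed immediately and may be overwritten until a final version arrives.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RevisableDisplay:
    """Hypothetical display buffer: text per segment may be revised until final."""

    segments: Dict[int, str] = field(default_factory=dict)
    final: Set[int] = field(default_factory=set)

    def show(self, seg_id: int, text: str, is_final: bool = False) -> bool:
        """Display or revise a segment; return False if it was already final."""
        if seg_id in self.final:
            return False  # final versions are never overwritten (an assumption)
        self.segments[seg_id] = text  # a later hypothesis replaces the earlier one
        if is_final:
            self.final.add(seg_id)
        return True

    def render(self) -> str:
        """Join the current text of all segments in segment order."""
        return " ".join(self.segments[k] for k in sorted(self.segments))
```

In this sketch, a low-latency partial hypothesis would be passed to `show` as soon as it is available, and the same segment ID would be reused when a more accurate version is produced with more context.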
Pages: 2513-2517
Number of pages: 5