Direct Segmentation Models for Streaming Speech Translation

Cited by: 0

Authors:

Iranzo-Sanchez, Javier [1]

Pastor, Adria Gimenez [1]

Silvestre-Cerda, Joan Albert [1]

Baquero-Arnal, Pau [1]

Civera, Jorge [1]

Juan, Alfons [1]

Affiliations:

[1] Univ Politècnica València, Machine Learning & Language Proc MLLP Res Grp, Valencian Res Inst Artificial Intelligence VRAIN, Valencia, Spain

Source:

PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP) | 2020

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Automatic Speech Recognition (ASR) system followed by a Machine Translation (MT) system. These systems are usually connected by a segmenter that splits the ASR output into, hopefully, semantically self-contained chunks to be fed into the MT system. This is especially challenging in the case of streaming ST, where latency requirements must also be taken into account. This work proposes novel segmentation models for streaming ST that incorporate not only textual, but also acoustic information to decide when the ASR output is split into a chunk. An extensive and thorough experimental setup is carried out on the Europarl-ST dataset to prove the contribution of acoustic information to the performance of the segmentation model in terms of BLEU score in a streaming ST scenario. Finally, comparative results with previous work also show the superiority of the segmentation models proposed in this work.


Pages: 2599-2611

Page count: 13

50 records in total

[31] Recent Advances in Direct Speech-to-text Translation
Xu, Chen
Ye, Rong
Dong, Qianqian
Zhao, Chengqi
Ko, Tom
Wang, Mingxuan
Xiao, Tong
Zhu, Jingbo
[J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6796 - 6804
[32] Speechformer: Reducing Information Loss in Direct Speech Translation
Papi, Sara
Gaido, Marco
Negri, Matteo
Turchi, Marco
[J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1698 - 1706
[33] DIRECT SPEECH TRANSLATION PROBLEMS IN VIETNAMESE LITERARY TEXTS
Britov, I. V.
[J]. RUSSIAN JOURNAL OF VIETNAMESE STUDIES-VYETNAMSKIYE ISSLEDOVANIYA, 2018, (01): : 123 - 148
[34] Translation, direct quotation and decontextualisation (Reported speech, process of translation, cultural criteria)
Slembrouck, S
[J]. PERSPECTIVES-STUDIES IN TRANSLATION THEORY AND PRACTICE, 1999, 7 (01): : 81 - 108
[35] Speech segmentation and interpretation using a semantic syntax-directed translation
De Mori, R.
Giordana, Attilio
Laface, Pietro
[J]. PATTERN RECOGNITION LETTERS, 1982, 1 (02) : 121 - 124
[36] Sequence-to-Sequence Models for Emphasis Speech Translation
Quoc Truong Do
Sakti, Sakriani
Nakamura, Satoshi
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1873 - 1883
[37] How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation
Gaido, Marco
Savoldi, Beatrice
Bentivogli, Luisa
Negri, Matteo
Turchi, Marco
[J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3576 - 3589
[38] TRANSLATION OF HELIUM SPEECH BY METHOD OF SEGMENTATION, PARTIAL-REJECTION AND EXPANSION
SUZUKI, J
NAKATSUI, M
TAKASUGI, T
TANAKA, R
[J]. JOURNAL OF THE RADIO RESEARCH LABORATORY, 1977, 24 (113): : 1 - 16
[39] SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Tsiamas, Ioannis
Gallego, Gerard I.
Fonollosa, Jose A. R.
Costa-jussa, Marta R.
[J]. INTERSPEECH 2022, 2022, : 106 - 110
[40] Similarities in fundamental frequency in infant speech segmentation models
Marklund, Ellen
Lacerda, Francisco
Schwarz, Iris-Corinna
Sundberg, Ulla
[J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1110 - 1113
