Lattice Transformer for Speech Translation

Cited by: 0
Authors
Zhang, Pei [1]
Chen, Boxing [1]
Ge, Niyu [1]
Fan, Kai [1]
Affiliations
[1] Alibaba Group Inc., Hangzhou, Zhejiang, People's Republic of China
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Recent advances in sequence modeling have highlighted the strengths of the transformer architecture, especially in achieving state-of-the-art machine translation results. However, depending on the upstream systems, e.g., speech recognition or word segmentation, the input to a translation system can vary greatly. The goal of this work is to extend the attention mechanism of the transformer to naturally consume a lattice in addition to the traditional sequential input. We first propose a general lattice transformer for speech translation, whose input is the output of an automatic speech recognition (ASR) system containing multiple paths and posterior scores. To leverage the extra information in the lattice structure, we develop a novel controllable lattice attention mechanism to obtain latent representations. On the LDC Spanish-English speech translation corpus, our experiments show that the lattice transformer generalizes significantly better and outperforms both a transformer baseline and a lattice LSTM. Additionally, we validate our approach on the WMT 2017 Chinese-English translation task with lattice inputs from different BPE segmentations, where we also observe improvements over strong baselines.
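The abstract's central idea, attention that consumes a lattice carrying posterior scores instead of a single sequence, can be made concrete. The following is a minimal NumPy sketch of one plausible reading of such a mechanism: self-attention logits are masked so that each lattice token attends only to tokens on a shared path, and a scalar weight lam biases attention toward tokens with higher ASR posterior probability. The names lattice_attention, reach, log_post, and lam are illustrative assumptions, not the paper's actual formulation.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lattice_attention(Q, K, V, reach, log_post, lam=1.0):
    """Scaled dot-product attention restricted to a lattice (illustrative).

    Q, K, V  : (n, d) arrays, one row per lattice token.
    reach    : (n, n) boolean mask; reach[i, j] is True iff tokens i and j
               lie on a common path through the lattice.
    log_post : (n,) log posterior score of each token from the ASR lattice.
    lam      : how strongly posteriors bias attention (0 ignores them); a
               stand-in for the paper's "controllable" weighting.
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)              # standard transformer scores
    logits = logits + lam * log_post[None, :]  # favor high-confidence tokens
    logits = np.where(reach, logits, -1e9)     # forbid off-path attention
    return softmax(logits, axis=-1) @ V

# Toy lattice for "a (b|c) d": tokens 0..3 = a, b, c, d.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
reach = np.ones((4, 4), dtype=bool)
reach[1, 2] = reach[2, 1] = False              # b and c never share a path
log_post = np.log(np.array([1.0, 0.7, 0.3, 1.0]))
out = lattice_attention(Q, K, V, reach, log_post, lam=0.5)  # shape (4, 8)

With lam = 0 and a fully True reach mask, the sketch reduces to ordinary sequential self-attention, consistent with the abstract's claim that lattice input generalizes the traditional sequential input.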
Pages: 6475-6484
Number of pages: 10
Related papers (50 in total)
  • [1] Zhang, RQ; Kikui, G. Integration of speech recognition and machine translation: Speech recognition word lattice translation. SPEECH COMMUNICATION, 2006, 48(3-4): 321-334.
  • [2] Kano, Takatomo; Sakti, Sakriani; Nakamura, Satoshi. Transformer-Based Direct Speech-to-Speech Translation with Transcoder. 2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021: 958-965.
  • [3] Arya, Lalaram; Chowdhury, Amartya Roy; Prasanna, S. R. Mahadeva. Direct vs Cascaded Speech-to-Speech Translation Using Transformer. SPEECH AND COMPUTER, SPECOM 2023, PT II, 2023, 14339: 258-270.
  • [4] Ma, Xutai; Wang, Yongqiang; Dousti, Mohammad Javad; Koehn, Philipp; Pino, Juan. Streaming Simultaneous Speech Translation with Augmented Memory Transformer. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7523-7527.
  • [5] Raffel, Matthew; Chen, Lizhong. Implicit Memory Transformer for Computationally Efficient Simultaneous Speech Translation. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023: 12900-12907.
  • [6] Xiao, Fengshun; Li, Jiangtong; Zhao, Hai; Wang, Rui; Chen, Kehai. Lattice-Based Transformer Encoder for Neural Machine Translation. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 3090-3097.
  • [7] Fang, Qingkai; Zhou, Yan; Feng, Yang. DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023.
  • [8] Deng, Keqi; Watanabe, Shinji; Shi, Jiatong; Arora, Siddhant. Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation. INTERSPEECH 2022, 2022: 1746-1750.
  • [9] Saon, George; Picheny, Michael. Lattice-based Viterbi decoding techniques for speech translation. 2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007: 386-389.
  • [10] Fukuda, Ryo; Novitasari, Sashi; Oka, Yui; Kano, Yasumasa; Yano, Yuki; Ko, Yuka; Tokuyama, Hirotaka; Doi, Kosuke; Yanagita, Tomoya; Sakti, Sakriani; Sudoh, Katsuhito; Nakamura, Satoshi. Simultaneous Speech-to-Speech Translation System with Transformer-Based Incremental ASR, MT, and TTS. 2021 24TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2021: 186-192.