ANALYZING ASR PRETRAINING FOR LOW-RESOURCE SPEECH-TO-TEXT TRANSLATION

Cited: 0
Authors
Stoian, Mihaela C. [1]
Bansal, Sameer [1]
Goldwater, Sharon [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Keywords
speech-to-text translation; transfer learning; pretraining; speech recognition; data augmentation; NEURAL-NETWORKS; RECURRENT
DOI
10.1109/icassp40776.2020.9053847
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Previous work has shown that for low-resource source languages, automatic speech-to-text translation (AST) can be improved by pretraining an end-to-end model on automatic speech recognition (ASR) data from a high-resource language. However, it is not clear which factors (e.g., language relatedness or size of the pretraining data) yield the biggest improvements, or whether pretraining can be effectively combined with other methods such as data augmentation. Here, we experiment with pretraining on datasets of varying sizes, including languages related and unrelated to the AST source language. We find that the best predictor of final AST performance is the word error rate of the pretrained ASR model, and that differences in ASR/AST performance correlate with how phonetic information is encoded in the later RNN layers of our model. We also show that pretraining and data augmentation yield complementary benefits for AST.
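A minimal sketch of the pretrain-then-transfer recipe summarized in the abstract, assuming a PyTorch-style recurrent encoder-decoder; the class names, dimensions, and encoder-only weight transfer are illustrative assumptions, not details taken from the paper.

# Hypothetical sketch: pretrain a speech encoder-decoder on high-resource ASR,
# then reuse the encoder weights to initialise the low-resource AST model
# before fine-tuning.
import torch.nn as nn

class SpeechEncoder(nn.Module):
    # Stacked bidirectional LSTM over log-mel filterbank frames.
    def __init__(self, n_mels=80, hidden=256, layers=3):
        super().__init__()
        self.rnn = nn.LSTM(n_mels, hidden, num_layers=layers,
                           bidirectional=True, batch_first=True)

    def forward(self, feats):          # feats: (batch, time, n_mels)
        out, _ = self.rnn(feats)
        return out                     # (batch, time, 2 * hidden)

class Seq2Seq(nn.Module):
    # The same architecture serves ASR (targets = transcripts) and
    # AST (targets = translations); only the training data differs.
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.encoder = SpeechEncoder(hidden=hidden)
        self.decoder = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

# 1) Pretrain on high-resource ASR data (training loop omitted); per the
#    abstract, a lower ASR word error rate predicts better final AST quality.
asr_model = Seq2Seq(vocab_size=100)

# 2) Initialise the AST model with the pretrained encoder, then fine-tune on
#    the low-resource (speech, translation) pairs, optionally with data
#    augmentation, which the paper finds complementary to pretraining.
ast_model = Seq2Seq(vocab_size=100)
ast_model.encoder.load_state_dict(asr_model.encoder.state_dict())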
Pages: 7909-7913
Page count: 5
Related papers
50 records in total
  • [21] CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
    Wang, Changhan
    Pino, Juan
    Wu, Anne
    Gu, Jiatao
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4197 - 4203
  • [22] End-to-End Speech-to-Text Translation: A Survey
    Sethiya, Nivedita
    Maurya, Chandresh Kumar
    [J]. Computer Speech and Language, 2025, 90
  • [23] AlloST: Low-resource Speech Translation without Source Transcription
    Cheng, Yao-Fei
    Lee, Hung-Shin
    Wang, Hsin-Min
    [J]. INTERSPEECH 2021, 2021, : 2252 - 2256
  • [24] Low-Resource Compositional Semantic Parsing with Concept Pretraining
    Rongali, Subendhu
    Sridhar, Mukund
    Khan, Haidar
    Arkoudas, Konstantine
    Hamza, Wael
    McCallum, Andrew
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1410 - 1419
  • [25] NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022
    Fukuda, Ryo
    Ko, Yuka
    Kano, Yasumasa
    Doi, Kosuke
    Tokuyama, Hirotaka
    Sakti, Sakriani
    Sudoh, Katsuhito
    Nakamura, Satoshi
    [J]. IWSLT 2022 - 19th International Conference on Spoken Language Translation, Proceedings of the Conference, 2022, : 286 - 292
  • [26] Crowdsourcing Latin American Spanish for Low-Resource Text-to-Speech
    Guevara-Rukoz, Adriana
    Demirsahin, Isin
    He, Fei
    Chu, Shan-Hui Cathy
    Sarin, Supheakmungkol
    Pipatsrisawat, Knot
    Gutkin, Alexander
    Butryna, Alena
    Kjartansson, Oddur
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6504 - 6513
  • [27] LOW-RESOURCE EXPRESSIVE TEXT-TO-SPEECH USING DATA AUGMENTATION
    Huybrechts, Goeric
    Merritt, Thomas
    Comini, Giulia
    Perz, Bartek
    Shah, Raahil
    Lorenzo-Trueba, Jaime
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6593 - 6597
  • [28] IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
    Denisov, Pavel
    Mager, Manuel
    Ngoc Thang Vu
    [J]. IWSLT 2021: THE 18TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE TRANSLATION, 2021, : 175 - 181
  • [29] Part-of-Speech Tags Guide Low-Resource Machine Translation
    Kadeer, Zaokere
    Yi, Nian
    Wumaier, Aishan
    [J]. ELECTRONICS, 2023, 12 (16)
  • [30] Learning Semantic Information from Machine Translation to Improve Speech-to-Text Translation
    Deng, Pan
    Zhang, Jie
    Zhou, Xinyuan
    Ye, Zhongyi
    Zhang, Weitai
    Cui, Jianwei
    Dai, Lirong
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 954 - 959