CROSS-LINGUAL TOPIC PREDICTION FOR SPEECH USING TRANSLATIONS

Cited by: 0
Authors
Bansal, Sameer [1]
Kamper, Herman [2]
Lopez, Adam [1]
Goldwater, Sharon [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
[2] Stellenbosch Univ, Dept E&E Engn, Stellenbosch, South Africa
Keywords
speech translation; low-resource speech processing; speech classification; unwritten languages; SPOKEN CONTENT; RECOGNITION; RETRIEVAL; LANGUAGE
DOI
10.1109/icassp40776.2020.9054169
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Given a large amount of unannotated speech in a low-resource language, can we classify the speech utterances by topic? We consider this question in the setting where a small amount of speech in the low-resource language is paired with text translations in a high-resource language. We develop an effective cross-lingual topic classifier by training on just 20 hours of translated speech, using a recent model for direct speech-to-text translation. While the translations are poor, they are still good enough to correctly classify the topic of 1-minute speech segments over 70% of the time, a 20% improvement over a majority-class baseline. Such a system could be useful for humanitarian applications like crisis response, where incoming speech in a foreign low-resource language must be quickly assessed for further action.
Pages: 8164-8168
Page count: 5