SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding

Cited by: 0
Authors
Chung, Yu-An [1 ]
Zhu, Chenguang [2 ]
Zeng, Michael [2 ]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[2] Microsoft Cognit Serv Grp, Redmond, WA USA
Keywords: (none listed)
DOI: (none)
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Spoken language understanding (SLU) requires a model to analyze an input acoustic signal to understand its linguistic content and make predictions. To boost model performance, various pre-training methods have been proposed to learn rich representations from large-scale unannotated speech and text. However, the inherent disparities between the two modalities necessitate a mutual analysis. In this paper, we propose a novel semi-supervised learning framework, SPLAT, to jointly pre-train the speech and language modules. Besides conducting a self-supervised masked language modeling task on the two individual modules using unpaired speech and text, SPLAT aligns representations from the two modules in a shared latent space using a small amount of paired speech and text. Thus, during fine-tuning, the speech module alone can produce representations carrying both acoustic information and contextual semantic knowledge of an input acoustic signal. Experimental results verify the effectiveness of our approach on various SLU tasks. For example, SPLAT improves the previous state-of-the-art performance on the Spoken SQuAD dataset by more than 10%.
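The abstract describes combining per-module self-supervised masked-modeling losses with an alignment term that pulls paired speech and text representations together in a shared latent space. As a rough illustration only (not the paper's actual implementation), the sketch below assumes sequence-level alignment via mean pooling and an L2 distance; the function names, the pooling choice, and the weighting scalar `lam` are all hypothetical.

```python
import numpy as np

def align_loss(speech_repr, text_repr):
    """Toy sequence-level alignment loss (an assumption, not SPLAT's exact form):
    mean-pool each (seq_len, dim) representation to one vector, then take the
    squared L2 distance between the pooled speech and text vectors."""
    s = speech_repr.mean(axis=0)  # (dim,) pooled speech representation
    t = text_repr.mean(axis=0)    # (dim,) pooled text representation
    return float(np.sum((s - t) ** 2))

def total_loss(mlm_speech, mlm_text, speech_repr, text_repr, lam=1.0):
    """Combine the two modules' masked-modeling losses (computed on unpaired
    data) with the alignment term (computed on paired data), weighted by a
    hypothetical scalar lam."""
    return mlm_speech + mlm_text + lam * align_loss(speech_repr, text_repr)

# Example: identical pooled representations incur zero alignment penalty,
# so the total reduces to the sum of the two masked-modeling losses.
speech = np.ones((5, 4))  # 5 speech frames, 4-dim features
text = np.ones((3, 4))    # 3 text tokens, 4-dim features
loss = total_loss(1.0, 2.0, speech, text)  # -> 3.0
```

The point of the sketch is the structure of the objective (unpaired self-supervision plus a small paired alignment term), which is what lets the speech module absorb contextual semantic knowledge from the text module.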
Pages: 1897-1907
Page count: 11
Related Papers
50 total
  • [31] Speech Characteristics in Female Students Training to Be Speech-Language Pathologists
    D'haeseleer, Evelien
    De Ley, Sophia
    Cosyns, Marjan
    Desomer, Els
    De Mesel, Jasmien
    Van Maele, George
    Van Lierde, Kristiane
    FOLIA PHONIATRICA ET LOGOPAEDICA, 2016, 68 (04) : 167 - 174
  • [32] Adaptive Training for Robust Spoken Language Understanding
    Garcia, Fernando
    Sanchis, Emilio
    Hurtado, Lluis-F.
    Segarra, Encarna
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2015, 2015, 9423 : 519 - 526
  • [33] Pre-training Language Models for Comparative Reasoning
    Yu, Mengxia
    Zhang, Zhihan
    Yu, Wenhao
    Jiang, Meng
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 12421 - 12433
  • [34] Sigmoid Loss for Language Image Pre-Training
    Zhai, Xiaohua
    Mustafa, Basil
    Kolesnikov, Alexander
    Beyer, Lucas
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11941 - 11952
  • [35] MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding
    Li, Junlong
    Xu, Yiheng
    Cui, Lei
    Wei, Furu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6078 - 6087
  • [36] Grounded Language-Image Pre-training
    Li, Liunian Harold
    Zhang, Pengchuan
    Zhang, Haotian
    Yang, Jianwei
    Li, Chunyuan
    Zhong, Yiwu
    Wang, Lijuan
    Yuan, Lu
    Zhang, Lei
    Hwang, Jenq-Neng
    Chang, Kai-Wei
    Gao, Jianfeng
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 10955 - 10965
  • [37] VILA: On Pre-training for Visual Language Models
    Lin, Ji
    Yin, Hongxu
    Ping, Wei
    Molchanov, Pavlo
    Shoeybi, Mohammad
    Han, Song
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 26679 - 26689
  • [38] RELATION ENHANCED VISION LANGUAGE PRE-TRAINING
    Lee, Ju-Hee
    Kang, Je-Won
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2286 - 2290
  • [39] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
    Jian, Yiren
    Gao, Chongyang
    Vosoughi, Soroush
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [40] Pre-Training Language Models for Identifying Patronizing and Condescending Language: An Analysis
    Perez-Almendros, Carla
    Espinosa-Anke, Luis
    Schockaert, Steven
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3902 - 3911