A STUDY ON THE INTEGRATION OF PRE-TRAINED SSL, ASR, LM AND SLU MODELS FOR SPOKEN LANGUAGE UNDERSTANDING

Cited by: 8
Authors
Peng, Yifan [1 ]
Arora, Siddhant [1 ]
Higuchi, Yosuke [1 ]
Ueda, Yushi [1 ]
Kumar, Sujay [1 ]
Ganesan, Karthik [1 ]
Dalmia, Siddharth [1 ]
Chang, Xuankai [1 ]
Watanabe, Shinji [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
U.S. National Science Foundation;
Keywords
spoken language understanding; low resource; pre-trained models;
DOI
10.1109/SLT54892.2023.10022399
Chinese Library Classification (CLC) code
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Collecting sufficient labeled data for spoken language understanding (SLU) is expensive and time-consuming. Recent studies have achieved promising results by using pre-trained models in low-resource scenarios. Inspired by this, we ask: which (if any) pre-training strategies can improve performance across SLU benchmarks? To answer this question, we employ four types of pre-trained models and their combinations for SLU. We leverage self-supervised speech models and language models (LMs) pre-trained on large quantities of unpaired data to extract strong speech and text representations. We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora. We conduct extensive experiments on the SLU Evaluation (SLUE) benchmark and observe that self-supervised pre-trained models are more powerful, with the pre-trained LM and speech models being most beneficial for the Sentiment Analysis and Named Entity Recognition tasks, respectively.
Pages: 406 - 413
Number of pages: 8
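To make the recipe described in the abstract concrete, below is a minimal sketch (not the authors' code, which builds on ESPnet) of combining a self-supervised speech model with a pre-trained LM for a downstream SLU classifier. The checkpoint names, the mean/CLS pooling, the ASR hypothesis used as text input, and the 3-way sentiment head are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: fuse a self-supervised speech representation with a pre-trained LM
# representation and feed the result to a small SLU classification head.
import torch
from transformers import (Wav2Vec2Model, Wav2Vec2FeatureExtractor,
                          AutoModel, AutoTokenizer)

speech_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
text_encoder = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Dummy 1-second, 16 kHz waveform and an ASR hypothesis standing in for real SLUE data.
waveform = torch.zeros(16000)
hypothesis = "i would like to book a flight to boston"

with torch.no_grad():
    speech_inputs = feature_extractor(waveform.numpy(), sampling_rate=16000,
                                      return_tensors="pt")
    # Mean-pool the frame-level SSL features into one utterance vector (1, 768).
    speech_repr = speech_encoder(**speech_inputs).last_hidden_state.mean(dim=1)
    text_inputs = tokenizer(hypothesis, return_tensors="pt")
    # Use the [CLS] token embedding as the utterance-level text vector (1, 768).
    text_repr = text_encoder(**text_inputs).last_hidden_state[:, 0]

# Hypothetical 3-way sentiment head (e.g., for SLUE sentiment analysis).
classifier = torch.nn.Linear(speech_repr.size(-1) + text_repr.size(-1), 3)
logits = classifier(torch.cat([speech_repr, text_repr], dim=-1))
print(logits.shape)  # torch.Size([1, 3])
```

In practice the encoders and the classification head would be fine-tuned jointly on the labeled SLU data; the sketch only shows how the pre-trained speech and text representations can be combined.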