Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

Cited by: 0
Authors
Tomanek, Katrin [1]
Zayats, Vicky [1]
Padfield, Dirk [1]
Vaillancourt, Kara [1]
Biadsy, Fadi [1]
Affiliation
[1] Google, Mountain View, CA 94043 USA
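Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract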
Automatic Speech Recognition (ASR) systems are often optimized to work best for speakers with canonical speech patterns. Unfortunately, these systems perform poorly when tested on atypical speech and heavily accented speech. It has previously been shown that personalization through model fine-tuning substantially improves performance. However, maintaining such large models per speaker is costly and difficult to scale. We show that by adding a relatively small number of extra parameters to the encoder layers via so-called residual adapters, we can achieve adaptation gains similar to those of model fine-tuning while updating only a tiny fraction (less than 0.5%) of the model parameters. We demonstrate this on two speech adaptation tasks (atypical and accented speech) and for two state-of-the-art ASR architectures.
Pages: 6751-6760
Page count: 10