Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

Authors
Tomanek, Katrin [1]
Zayats, Vicky [1]
Padfield, Dirk [1]
Vaillancourt, Kara [1]
Biadsy, Fadi [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic Speech Recognition (ASR) systems are often optimized to work best for speakers with canonical speech patterns. Unfortunately, these systems perform poorly when tested on atypical speech and heavily accented speech. It has previously been shown that personalization through model fine-tuning substantially improves performance. However, maintaining such large models per speaker is costly and difficult to scale. We show that by adding a relatively small number of extra parameters to the encoder layers via so-called residual adapters, we can achieve adaptation gains similar to those of model fine-tuning, while only updating a tiny fraction (less than 0.5%) of the model parameters. We demonstrate this on two speech adaptation tasks (atypical and accented speech) and for two state-of-the-art ASR architectures.
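The abstract does not spell out the adapter's internals, so the following is only a minimal sketch of one common bottleneck design (layer normalization, down-projection, ReLU, up-projection, residual add), which is an assumption rather than the authors' confirmed architecture. The names `ResidualAdapter` and `adapter_fraction`, and all sizes used with them, are hypothetical and chosen for illustration. The sketch shows two things: that a zero-initialized up-projection leaves the frozen encoder's output unchanged before training, and how the trainable fraction of parameters can be estimated.

```python
import math
import random


def layer_norm(x, eps=1e-6):
    """Normalize a feature vector to zero mean and unit variance (no learned scale)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class ResidualAdapter:
    """Hypothetical bottleneck adapter applied to one encoder layer's output."""

    def __init__(self, d_model, d_bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: small random weights (assumed initialization).
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                       for _ in range(d_bottleneck)]
        self.b_down = [0.0] * d_bottleneck
        # Up-projection: zeros, so the adapter starts as an identity mapping.
        self.w_up = [[0.0] * d_bottleneck for _ in range(d_model)]
        self.b_up = [0.0] * d_model

    def __call__(self, x):
        h = layer_norm(x)
        h = [max(0.0, v + b) for v, b in zip(matvec(self.w_down, h), self.b_down)]
        delta = [v + b for v, b in zip(matvec(self.w_up, h), self.b_up)]
        # Residual connection: the adapter only adds a correction to x.
        return [xi + di for xi, di in zip(x, delta)]

    def num_params(self):
        d_bottleneck, d_model = len(self.w_down), len(self.w_up)
        return 2 * d_model * d_bottleneck + d_bottleneck + d_model


def adapter_fraction(num_layers, d_model, d_bottleneck, base_params):
    """Trainable adapter parameters as a fraction of the frozen base model."""
    per_adapter = 2 * d_model * d_bottleneck + d_bottleneck + d_model
    return num_layers * per_adapter / base_params


if __name__ == "__main__":
    adapter = ResidualAdapter(d_model=8, d_bottleneck=2)
    features = [0.5 * i for i in range(8)]
    print(adapter(features) == features)  # identity before any training
    # Illustrative sizes only; these are not the models used in the paper.
    print(adapter_fraction(12, 512, 32, 100_000_000))
```

With the illustrative sizes above (12 encoder layers, model width 512, bottleneck 32, a 100M-parameter base model), the adapters account for roughly 0.4% of the parameters, which is in the same range as the "less than 0.5%" figure quoted in the abstract. Only the adapter weights would need to be stored per speaker; the base model stays shared.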
Pages: 6751-6760
Page count: 10