Model-Agnostic Meta-Learning for Fast Text-Dependent Speaker Embedding Adaptation

Cited by: 2
Authors
Lin, Weiwei [1 ]
Mak, Man-Wai [1 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep speaker embedding; text-dependent speaker verification; meta-learning; model adaptation; MAML;
DOI
10.1109/TASLP.2023.3275029
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
By constraining the lexical content of input speech, text-dependent speaker verification (TD-SV) offers more reliable performance than text-independent speaker verification (TI-SV) on short utterances. Because speech with constrained lexical content is harder to collect, TD models are often fine-tuned from a TI model using a small target-phrase dataset. However, the target-phrase dataset is sometimes too small even for fine-tuning, which is the main obstacle to deploying TD-SV. One solution is to fine-tune the model on medium-sized multi-phrase TD data and then deploy it on the target phrase. Although this strategy helps in some cases, performance remains sub-optimal because the model is not optimized for the target phrase. Inspired by recent progress in meta-learning, we propose a three-stage pipeline for adapting a TI model to a TD model for the target phrase. First, a TI model is trained on a large amount of speech data. Then, a multi-phrase TD dataset is used to tune the TI model via model-agnostic meta-learning (MAML). Finally, fast adaptation is performed using a small target-phrase dataset. Results show that the three-stage pipeline consistently outperforms both multi-phrase and target-phrase fine-tuning.
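The meta-learning stage described in the abstract follows the MAML recipe: an inner loop adapts the shared parameters to each training phrase, and an outer loop updates the initialization so that such adaptation works well. The following is a toy first-order MAML sketch in NumPy on a linear regression surrogate (not the paper's speaker-embedding model; all names and the task setup are illustrative), showing only the inner-step/outer-step structure:

```python
import numpy as np

def task_loss_grad(w, X, y):
    # Mean-squared-error loss and its gradient for a linear model X @ w
    err = X @ w - y
    return np.mean(err ** 2), 2 * X.T @ err / len(y)

def maml_step(w, tasks, inner_lr=0.05, outer_lr=0.01):
    """One first-order MAML meta-update over a batch of tasks.

    Each task (analogous to one training phrase) is a tuple
    (X_support, y_support, X_query, y_query).
    """
    meta_grad = np.zeros_like(w)
    for Xs, ys, Xq, yq in tasks:
        # Inner loop: one gradient step on the task's support set
        _, g = task_loss_grad(w, Xs, ys)
        w_adapted = w - inner_lr * g
        # Outer loop: query-set gradient at the adapted weights
        # (first-order MAML ignores second-order terms)
        _, gq = task_loss_grad(w_adapted, Xq, yq)
        meta_grad += gq
    return w - outer_lr * meta_grad / len(tasks)

rng = np.random.default_rng(0)

def make_task(true_w):
    # Synthetic task: noisy linear data split into support and query halves
    X = rng.normal(size=(20, 1))
    y = X @ true_w + 0.01 * rng.normal(size=20)
    return X[:10], y[:10], X[10:], y[10:]

# Meta-train an initialization over three tasks, then it can be
# quickly adapted to a new task with a single inner step.
w = np.zeros(1)
tasks = [make_task(np.array([tw])) for tw in (1.5, 2.0, 2.5)]
for _ in range(200):
    w = maml_step(w, tasks)
```

In the paper's pipeline this meta-update would be applied to a TI-pretrained network over multi-phrase TD data, after which the final stage is just the inner-loop adaptation on the small target-phrase dataset.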
Pages: 1866-1876
Number of pages: 11