Limited Labels for Unlimited Data: Active Learning for Speaker Recognition

Authors
Shum, Stephen H. [1]
Dehak, Najim [1]
Glass, James R. [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
speaker recognition; i-vectors; active learning; verification
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we attempt to quantify the amount of labeled data necessary to build a state-of-the-art speaker recognition system. We begin by using i-vectors and the cosine similarity metric to represent an unlabeled set of utterances, then obtain labels from a noiseless oracle in the form of pairwise queries. Finally, we use the resulting speaker clusters to train a PLDA scoring function, which is assessed on the 2010 NIST Speaker Recognition Evaluation. After presenting the initial results of an algorithm that sorts queries based on nearest-neighbor pairs, we develop techniques that further minimize the number of queries needed to obtain state-of-the-art performance. We show the generalizability of our methods in anecdotal fashion by applying our methods to two different distributions of utterances-per-speaker and, ultimately, find that the actual number of pairwise labels needed to obtain state-of-the-art results may be a mere fraction of the queries required to fully label the entire set of utterances.
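The query procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes short toy vectors in place of real i-vectors, a hypothetical `oracle(i, j)` callback that answers "same speaker?" without noise, and a simple union-find structure to merge utterances into clusters. Pairs are queried in order of decreasing cosine similarity, and a pair is skipped when earlier answers already place both utterances in the same cluster.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_with_pairwise_queries(vectors, oracle, budget):
    """Query the most similar pairs first, using at most `budget` queries.

    Returns (cluster id per vector, number of oracle queries used).
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sort all candidate pairs by decreasing cosine similarity.
    pairs = sorted(
        combinations(range(len(vectors)), 2),
        key=lambda p: cosine(vectors[p[0]], vectors[p[1]]),
        reverse=True,
    )

    used = 0
    for i, j in pairs:
        if used >= budget:
            break
        root_i, root_j = find(i), find(j)
        if root_i == root_j:
            continue  # already implied by earlier "same speaker" answers
        used += 1
        if oracle(i, j):
            parent[root_j] = root_i  # merge the two clusters

    return [find(i) for i in range(len(vectors))], used


# Toy data (assumed for illustration): five 2-D "i-vectors", two speakers.
vectors = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9), (1.0, 0.0)]
speakers = ["a", "a", "b", "b", "a"]


def oracle(i, j):
    """Noiseless oracle: True when both utterances share a speaker."""
    return speakers[i] == speakers[j]


labels, used = cluster_with_pairwise_queries(vectors, oracle, budget=3)
```

On this toy set, the three most similar pairs all come from the same speaker, so three queries are enough to recover both clusters out of ten possible pairs. The sketch does not record "different speaker" answers, so a larger budget would spend queries on pairs that a more careful scheme could rule out.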
Pages: 383-387
Page count: 5