Likelihood-Based Semi-Supervised Model Selection With Applications to Speech Processing

Cited by: 0
Authors
White, Christopher M. [1]
Khudanpur, Sanjeev P. [2]
Wolfe, Patrick J. [3]
Affiliations
[1] Johns Hopkins Univ, HLT COE, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[3] Harvard Univ, Stat & Informat Sci Lab, Cambridge, MA 02138 USA
Keywords
Likelihood ratio tests; pronunciation modeling; robust statistics; semi-supervised learning; sign test; speech recognition; spoken term detection
DOI
10.1109/JSTSP.2010.2076050
Chinese Library Classification (CLC) number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Subject classification code
0808; 0809
Abstract
In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some other means. In the context of speech processing systems and other large-scale practical applications, however, such labeled development data are typically costly and difficult to obtain. This paper investigates an alternative semi-supervised framework for likelihood-based model selection that leverages unlabeled data by using trained classifiers representing each model to automatically generate putative labels. The errors that result from this automatic labeling are shown to be amenable to results from robust statistics, which in turn provide for minimax-optimal censored likelihood ratio tests that recover the nonparametric sign test as a limiting case. This approach is then validated experimentally using a state-of-the-art automatic speech recognition system to select between candidate word pronunciations using unlabeled speech data that only potentially contain instances of the words under test. Results provide supporting evidence for the utility of this approach, and suggest that it may also find use in other applications of machine learning.
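The censored likelihood ratio test mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the clipping bounds, the zero decision threshold, and the example values are all assumptions made for illustration. The sketch clips each per-sample log-likelihood ratio to a fixed interval before summing, so that a single mislabeled sample cannot dominate the decision, and it checks that with a very tight symmetric interval the rescaled statistic matches the sign-test count.

```python
import math


def censored_llr(llr_values, lower, upper):
    """Sum per-sample log-likelihood ratios after clipping each to [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return sum(min(max(v, lower), upper) for v in llr_values)


def sign_statistic(llr_values):
    """Sign-test statistic: positive ratios minus negative ratios."""
    return sum((v > 0) - (v < 0) for v in llr_values)


def select_model(llr_values, lower, upper, threshold=0.0):
    """Pick model "A" if the censored statistic exceeds the threshold, else "B".

    The threshold of 0.0 is an illustrative assumption.
    """
    return "A" if censored_llr(llr_values, lower, upper) > threshold else "B"


# Hypothetical log-likelihood ratios log p_A(x) - log p_B(x) for four samples;
# the last one stands in for a grossly mislabeled sample.
llrs = [0.8, 0.6, 0.7, -25.0]

uncensored_choice = select_model(llrs, -math.inf, math.inf)  # outlier dominates
censored_choice = select_model(llrs, -1.0, 1.0)              # outlier is capped

# With tight symmetric bounds [-c, c], the statistic divided by c
# equals the sign statistic for these (all nonzero) values.
c = 0.01
rescaled = censored_llr(llrs, -c, c) / c
```

In this toy example the uncensored sum favors model B because of the one extreme value, whereas the censored sum favors model A, in agreement with the sign statistic.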
Pages: 1016 - 1026
Page count: 11
Related papers
50 in total
  • [21] An Effective Semi-supervised Model for Intrusion Detection Using Feature Selection Based LapSVM
    Zhang, Xiaofeng
    Tian, Jianwei
    Zhu, Peidong
    Zhang, Jiexin
    2017 INTERNATIONAL CONFERENCE ON COMPUTER, INFORMATION AND TELECOMMUNICATION SYSTEMS (IEEE CITS), 2017, : 284 - 287
  • [22] Semi-supervised acoustic model training for speech with code-switching
    Yilmaz, Emre
    McLaren, Mitchell
    van den Heuvel, Henk
    van Leeuwen, David A.
    SPEECH COMMUNICATION, 2018, 105 : 12 - 22
  • [23] Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection
    Ang, Jun Chin
    Mirzal, Andri
    Haron, Habibollah
    Hamed, Haza Nuzly Abdull
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (05) : 971 - 989
  • [24] A semi-supervised learning approach for model selection based on class-hypothesis testing
    Gorriz, Juan M.
    Ramirez, Javier
    Suckling, John
    Martinez-Murcia, F. J.
    Illan, I. A.
    Segovia, F.
    Ortiz, A.
    Salas-Gonzalez, D.
    Castillo-Barnes, D.
    Puntonet, C. G.
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 90 : 40 - 49
  • [25] Manifold Based Fisher Method for Semi-Supervised Feature Selection
    Lv, Sunzhong
    Jiang, Hongxing
    Zhao, Li
    Wang, Di
    Fan, Mingyu
    2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 664 - 668
  • [26] IMPROVING SEMI-SUPERVISED CLASSIFICATION FOR LOW-RESOURCE SPEECH INTERACTION APPLICATIONS
    Kumar, Manoj
    Papadopoulos, Pavlos
    Travadi, Ruchir
    Bone, Daniel
    Narayanan, Shrikanth
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5149 - 5153
  • [27] Clustering-based Feature Selection in Semi-supervised Problems
    Quinzan, Ianisse
    Sotoca, Jose M.
    Pla, Filiberto
    2009 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2009, : 535 - 540
  • [28] Semi-Supervised Local-Learning-based Feature Selection
    Wang, Jim Jing-Yan
    Yao, Jin
    Sun, Yijun
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 1942 - 1948
  • [29] Semi-supervised feature selection based on local discriminative information
    Zeng, Zhiqiang
    Wang, Xiaodong
    Zhang, Jian
    Wu, Qun
    NEUROCOMPUTING, 2016, 173 : 102 - 109
  • [30] Semi-Supervised Clustering Ensemble Based on Cluster Consensus Selection
    Liu, Yanxi
    Al-Khafaji, Ali Hussein Demin
    CYBERNETICS AND SYSTEMS, 2025, 56 (03) : 213 - 241