Likelihood-Based Semi-Supervised Model Selection With Applications to Speech Processing

Cited by: 0
Authors
White, Christopher M. [1 ]
Khudanpur, Sanjeev P. [2 ]
Wolfe, Patrick J. [3 ]
Affiliations
[1] Johns Hopkins Univ, HLT COE, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[3] Harvard Univ, Stat & Informat Sci Lab, Cambridge, MA 02138 USA
Keywords
Likelihood ratio tests; pronunciation modeling; robust statistics; semi-supervised learning; sign test; speech recognition; spoken term detection
DOI
10.1109/JSTSP.2010.2076050
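Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract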
In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some other means. In the context of speech processing systems and other large-scale practical applications, however, such labeled development data are typically costly and difficult to obtain. This paper investigates an alternative semi-supervised framework for likelihood-based model selection that leverages unlabeled data by using trained classifiers representing each model to automatically generate putative labels. The errors that result from this automatic labeling are shown to be amenable to results from robust statistics, which in turn provide for minimax-optimal censored likelihood ratio tests that recover the nonparametric sign test as a limiting case. This approach is then validated experimentally using a state-of-the-art automatic speech recognition system to select between candidate word pronunciations using unlabeled speech data that only potentially contain instances of the words under test. Results provide supporting evidence for the utility of this approach, and suggest that it may also find use in other applications of machine learning.
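The censored likelihood ratio test named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the clipping bounds, the zero threshold, and the sample values are all hypothetical. It assumes each unlabeled sample yields one log-likelihood ratio between two candidate models, A and B, with positive values favoring A. Clipping each ratio limits the influence of any single mislabeled sample, and as the clipping bounds shrink toward zero the scaled statistic reduces to a count of signs.

```python
def censored_llr(llrs, lo, hi):
    """Sum of per-sample log-likelihood ratios, each clipped to [lo, hi]."""
    return sum(min(max(x, lo), hi) for x in llrs)


def sign_statistic(llrs):
    """Count of positive ratios minus count of negative ratios (ties add 0)."""
    return sum((x > 0) - (x < 0) for x in llrs)


def select_model(llrs, lo, hi, threshold=0.0):
    """Pick model "A" if the censored statistic exceeds the threshold, else "B".

    The threshold of 0.0 is an illustrative assumption, not a value from the paper.
    """
    return "A" if censored_llr(llrs, lo, hi) > threshold else "B"


# Hypothetical per-sample log-likelihood ratios: four mildly favor A,
# one extreme value (for example, from a bad automatic label) favors B.
example_llrs = [0.4, 0.9, -25.0, 0.7, 0.3]
```

With very wide bounds such as `select_model(example_llrs, -100.0, 100.0)`, the single extreme value dominates and B is chosen. With bounds of `[-1.0, 1.0]` the same data select A. For a small symmetric bound `c`, `censored_llr(example_llrs, -c, c) / c` approaches `sign_statistic(example_llrs)`, which is the sign-test limit the abstract refers to.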
Pages: 1016-1026
Page count: 11
Related papers
50 in total
  • [1] A semi-supervised RUL prediction with likelihood-based pseudo labeling for suspension histories
    Takayama, Ryosuke
    Natsumeda, Masanao
    Yairi, Takehisa
    2023 IEEE INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT, ICPHM, 2023, : 296 - 303
  • [2] Semi-supervised Part-of-speech Tagging in Speech Applications
    Dufour, Richard
    Favre, Benoit
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1373 - 1376
  • [3] Semi-supervised Model for Emotion Recognition in Speech
    Pereira, Ingryd
    Santos, Diego
    Maciel, Alexandre
    Barros, Pablo
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT I, 2018, 11139 : 791 - 800
  • [4] Semi-supervised generative model with applications
    An, Dezhi
    Lu, Jun
    Wu, Guangli
    Zheng, Shengcai
    Li, Yan
    Journal of Computational Information Systems, 2015, 11 (05): : 1809 - 1816
  • [5] Semi-supervised model selection based on cross-validation
    Kaariainen, Matti
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 1894 - 1899
  • [6] Safe semi-supervised learning based on weighted likelihood
    Kawakita, Masanori
    Takeuchi, Jun'ichi
    NEURAL NETWORKS, 2014, 53 : 146 - 164
  • [7] Active Model Selection for Graph-Based Semi-Supervised Learning
    Zhao, Bin
    Wang, Fei
    Zhang, Changshui
    Song, Yangqiu
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 1881 - 1884
  • [8] The Model Selection for Semi-supervised Support Vector Machines
    Zhao, Ying
    Zhang, Jian-pei
    Yang, Jing
    ICICSE: 2008 INTERNATIONAL CONFERENCE ON INTERNET COMPUTING IN SCIENCE AND ENGINEERING, PROCEEDINGS, 2008, : 102 - 105
  • [9] LIKELIHOOD-BASED MODEL SELECTION FOR STOCHASTIC BLOCK MODELS
    Wang, Y. X. Rachel
    Bickel, Peter J.
    ANNALS OF STATISTICS, 2017, 45 (02): : 500 - 528
  • [10] Semi-Supervised Learning of Speech Sounds
    Jansen, Aren
    Niyogi, Partha
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2264 - 2267