Unsupervised training of an HMM-based Speech Recognizer for Topic Classification

Cited by: 0
Authors
Gish, Herbert [1]
Siu, Man-hung [1]
Chan, Arthur [1]
Belfield, Bill [1]
Affiliations
[1] BBN Technol, Speech & Language Proc Dept, Cambridge, MA 02421 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
HMM-based Speech-To-Text (STT) systems are widely deployed not only for dictation tasks but also as the first processing stage of many automatic speech applications such as spoken topic classification. However, the need for transcribed data to train the HMMs precludes their use in domains where transcribed speech is difficult to come by because of the specific domain, channel or language. In this work, we propose building HMM-based speech recognizers without transcribed data by formulating the HMM training as an optimization over both the parameter space and the transcription sequence space. We describe how this can be easily implemented using existing STT tools. We tested the effectiveness of our unsupervised training approach on the task of topic classification on the Switchboard corpus. The unsupervised HMM recognizer, initialized with a segmental tokenizer, outperformed both an HMM phoneme recognizer trained with 1 hour of transcribed data and the Brno University of Technology (BUT) Hungarian phoneme recognizer. This approach can also be applied to other speech applications, including spoken term detection, language and speaker verification.
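The joint optimization over parameters and transcriptions described in the abstract can be pictured as an alternation: decode a label sequence with the current models, then re-estimate the models from those labels. The sketch below is only a toy illustration of that idea and is not the authors' system. It assumes one-dimensional features, unit-variance Gaussian units, and no transition model, so the alternation reduces to a hard-EM (k-means-like) loop. The function names `decode`, `reestimate`, and `train_unsupervised` are hypothetical, and `init_means` merely stands in for the segmental-tokenizer initialization.

```python
from typing import List, Tuple


def decode(frames: List[float], means: List[float]) -> List[int]:
    """Transcription step: label each frame with its most likely unit.

    Assumption: unit-variance Gaussians and no transition model, so the
    most likely unit is simply the one whose mean is closest to the frame.
    """
    return [
        min(range(len(means)), key=lambda k: (x - means[k]) ** 2)
        for x in frames
    ]


def reestimate(frames: List[float], labels: List[int],
               means: List[float]) -> List[float]:
    """Parameter step: set each unit mean to the average of its frames."""
    new_means = list(means)
    for k in range(len(means)):
        assigned = [x for x, lab in zip(frames, labels) if lab == k]
        if assigned:  # keep the old mean if a unit received no frames
            new_means[k] = sum(assigned) / len(assigned)
    return new_means


def train_unsupervised(frames: List[float], init_means: List[float],
                       max_iters: int = 20) -> Tuple[List[float], List[int]]:
    """Alternate between the two steps until the labels stop changing."""
    means = list(init_means)
    labels = decode(frames, means)
    for _ in range(max_iters):
        means = reestimate(frames, labels, means)
        new_labels = decode(frames, means)
        if new_labels == labels:
            break
        labels = new_labels
    return means, labels


# Toy "acoustic" frames drawn from two clearly separated sound classes.
frames = [0.1, 0.2, 0.0, 5.1, 4.9, 5.0, 0.3, 4.8]
means, labels = train_unsupervised(frames, init_means=[1.0, 3.0])
print(labels)
```

In a real recognizer the decoding step would be a Viterbi pass over HMM states with a language model on the unit sequence, and the parameter step would be standard HMM re-estimation; the toy loop only keeps the alternating structure.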
Pages: 1895-1898
Page count: 4
Related papers
  • [1] Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery
    Siu, Man-Hung
    Gish, Herbert
    Chan, Arthur
    Belfield, William
    Lowe, Steve
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (01): : 210 - 223
  • [2] Improved Topic Classification and Keyword Discovery using an HMM-based Speech Recognizer Trained without Supervision
    Siu, Man-Hung
    Gish, Herbert
    Chan, Arthur
    Belfield, William
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2842 - 2845
  • [3] FPGA Architecture of HMM-based Decoder Module in Speech Recognizer
    Trang Hoang
    Viet Vo Quoc
    Truong Nguyen Ly Thien
    [J]. 2012 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2012, : 354 - 358
  • [4] Unsupervised adaptation for HMM-based speech synthesis
    King, Simon
    Tokuda, Keiichi
    Zen, Heiga
    Yamagishi, Junichi
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1869 - +
  • [5] An HMM-based speech recognizer using overlapping articulatory features
    Erler, K
    Freeman, GH
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (04): : 2500 - 2513
  • [6] HMM-based speech recognizer using overlapping articulatory features
    Erler, Kevin
    Freeman, George H.
    [J]. Journal of the Acoustical Society of America, 1996, 100 (4 pt 1):
  • [7] Experiments on a parametric nonlinear spectral warping for an HMM-based speech recognizer
    Mashao, DJ
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 17 - 20
  • [8] Normalized training for HMM-based visual speech recognition
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    Kitamura, Tadashi
    Kobayashi, Takao
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (11): : 40 - 50
  • [9] Normalized training for HMM-based visual speech recognition
    Nankaku, Y
    Tokuda, K
    Kitamura, T
    Kobayashi, T
    [J]. 2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS, 2000, : 234 - 237
  • [10] An improved training algorithm in HMM-based speech recognition
    Li, GJ
    Huong, TY
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1057 - 1060