Unsupervised training of an HMM-based Speech Recognizer for Topic Classification

Cited by: 0
Authors
Gish, Herbert [1]
Siu, Man-hung [1]
Chan, Arthur [1]
Belfield, Bill [1]
Affiliations
[1] BBN Technol, Speech & Language Proc Dept, Cambridge, MA 02421 USA
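Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract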
HMM-based Speech-To-Text (STT) systems are widely deployed not only for dictation tasks but also as the first processing stage of many automatic speech applications, such as spoken topic classification. However, the need for transcribed data to train the HMMs precludes their use in domains where transcribed speech is difficult to come by because of the specific domain, channel or language. In this work, we propose building HMM-based speech recognizers without transcribed data by formulating the HMM training as an optimization over both the parameter space and the transcription sequence space. We describe how this can be easily implemented using existing STT tools. We tested the effectiveness of our unsupervised training approach on the task of topic classification on the Switchboard corpus. The unsupervised HMM recognizer, initialized with a segmental tokenizer, outperformed both an HMM phoneme recognizer trained with 1 hour of transcribed data and the Brno University of Technology (BUT) Hungarian phoneme recognizer. This approach can also be applied to other speech applications, including spoken term detection, language and speaker verification.
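The idea of optimizing over both the model parameters and the transcription sequence can be illustrated with a toy alternating procedure. The sketch below is not the authors' implementation and uses no STT toolkit. It assumes one-dimensional frames, one Gaussian mean per unit with a fixed variance, and a fixed penalty for switching units in place of real HMM transition probabilities. All function names and parameters (`viterbi_labels`, `reestimate_means`, `unsupervised_train`, `switch_penalty`) are hypothetical, and the initial means merely stand in for an initialization such as a segmental tokenizer.

```python
def viterbi_labels(frames, means, switch_penalty=2.0, var=1.0):
    """Transcription step: best unit-label sequence for fixed parameters."""
    k = len(means)

    def loglik(x, j):
        # Gaussian log-likelihood up to a constant (fixed variance assumed).
        return -0.5 * (x - means[j]) ** 2 / var

    def move_cost(i, j):
        # Staying in a unit is free; switching units costs a fixed penalty.
        return 0.0 if i == j else switch_penalty

    score = [loglik(frames[0], j) for j in range(k)]
    back = []
    for x in frames[1:]:
        new_score, ptr = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] - move_cost(i, j))
            ptr.append(best_i)
            new_score.append(score[best_i] - move_cost(best_i, j) + loglik(x, j))
        score = new_score
        back.append(ptr)

    # Trace back from the best final unit.
    j = max(range(k), key=lambda i: score[i])
    labels = [j]
    for ptr in reversed(back):
        j = ptr[j]
        labels.append(j)
    labels.reverse()
    return labels


def reestimate_means(frames, labels, means):
    """Parameter step: update each unit mean from the frames assigned to it."""
    new_means = list(means)
    for j in range(len(means)):
        assigned = [x for x, lab in zip(frames, labels) if lab == j]
        if assigned:  # keep the old mean if a unit received no frames
            new_means[j] = sum(assigned) / len(assigned)
    return new_means


def unsupervised_train(frames, init_means, iterations=10):
    """Alternate decoding and re-estimation until the labels stop changing."""
    means = list(init_means)
    labels = viterbi_labels(frames, means)
    for _ in range(iterations):
        means = reestimate_means(frames, labels, means)
        new_labels = viterbi_labels(frames, means)
        if new_labels == labels:
            break
        labels = new_labels
    return means, labels


if __name__ == "__main__":
    toy_frames = [0.1, -0.2, 0.0, 5.1, 4.9, 5.0, 0.2, -0.1]
    print(unsupervised_train(toy_frames, init_means=[1.0, 4.0]))
```

In this toy setting, the procedure is a hard-assignment (Viterbi-style) alternation: the decoded labels play the role of the unknown transcription, and no transcribed data is used at any point.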
Pages: 1895-1898
Number of pages: 4