Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition

Cited by: 0
Authors
Li, Xiangang [1]
Su, Dan [1]
Pang, Zaihu [1]
Wu, Xihong [1]
Affiliations
[1] Peking Univ, Speech & Hearing Res Ctr, Key Lab Machine Percept, Minist Educ, Beijing 100871, Peoples R China
Keywords
speech recognition; probabilistic latent speaker analysis; speaker clustering; speaker-class
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a probabilistic speaker-class (PSC) based acoustic modeling method that accounts for speaker variability in HMM-based speech recognition systems. First, within the context of speaker-class based speech recognition, an experiment investigates the performance of speaker-class recognition based on hard-cut speaker clustering. Then, in the proposed method, probabilistic latent speaker analysis is introduced so that the speaker-class dependent acoustic models are trained with a soft-decision speaker clustering method and combined according to the speaker-class distribution in the decoding phase. Experiments on a 600-hour speech corpus show improvement on a large vocabulary continuous speech recognition task.
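The abstract does not give the combination formula, so the following is only a minimal sketch of one plausible reading: the likelihoods from the speaker-class dependent models are mixed with weights taken from the speaker-class distribution. The function name, the weighted-sum form, and all numbers are assumptions made for illustration and are not taken from the paper.

```python
import math


def combine_class_log_likelihoods(class_log_likes, class_weights):
    """Return log(sum_c w_c * exp(ll_c)), computed stably in the log domain.

    class_log_likes: log-likelihood of one observation under each
                     speaker-class dependent model (hypothetical values).
    class_weights:   assumed speaker-class distribution; should sum to 1.
    """
    if len(class_log_likes) != len(class_weights):
        raise ValueError("need one weight per speaker class")
    # Classes with zero weight contribute nothing to the mixture.
    terms = [ll + math.log(w)
             for ll, w in zip(class_log_likes, class_weights) if w > 0.0]
    if not terms:
        raise ValueError("at least one class weight must be positive")
    peak = max(terms)
    # Log-sum-exp: subtract the largest term before exponentiating.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Illustrative (made-up) log-likelihoods under three speaker-class models.
log_likes = [-10.0, -12.0, -11.0]

# Soft decision: every class contributes according to its weight.
soft_score = combine_class_log_likelihoods(log_likes, [0.7, 0.2, 0.1])

# Hard-cut decision: all weight on one class, so only that model is used.
hard_score = combine_class_log_likelihoods(log_likes, [1.0, 0.0, 0.0])
```

With a one-hot weight vector the function reduces to selecting a single class model, which is how the hard-cut case relates to the soft-decision case in this sketch.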
Pages: 1218-1221
Page count: 4
Related papers
50 in total
  • [1] ACOUSTIC MODELING OF SUBWORD UNITS FOR LARGE VOCABULARY SPEAKER INDEPENDENT SPEECH RECOGNITION
    LEE, CH
    RABINER, LR
    PIERACCINI, R
    WILPON, JG
    [J]. SPEECH AND NATURAL LANGUAGE, 1989: 280-291
  • [2] Probabilistic Latent Speaker Training for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 1225-1228
  • [3] Probabilistic Latent Speaker Analysis for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 1889-1892
  • [4] Speech Recognition with Large-Scale Speaker-Class-Based Acoustic Modeling
    Konno, Kazuki
    Kato, Masaharu
    Kosaka, Tetsuo
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013
  • [5] Speaker verification through large vocabulary continuous speech recognition
    Newman, M
    Gillick, L
    Ito, Y
    McAllaster, D
    Peskin, B
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996: 2419-2422
  • [6] Speaker selection training for large vocabulary continuous speech recognition
    Huang, C
    Chen, T
    Chang, E
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002: 609-612
  • [7] Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition
    Kosaka, Tetsuo
    Takeda, Yuui
    Ito, Takashi
    Kato, Masaharu
    Kohda, Masaki
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): 2363-2369
  • [8] Speaker adaptation in the philips system for large vocabulary continuous speech recognition
    Thelen, E
    Aubert, X
    Beyerlein, P
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V, 1997: 1035-1038
  • [9] ON LARGE-VOCABULARY SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION
    LEE, KF
    [J]. SPEECH COMMUNICATION, 1988, 7 (04): 375-379
  • [10] Deep Neural Network-Based Speech Recognition with Combination of Speaker-Class Models
    Kosaka, Tetsuo
    Konno, Kazuki
    Kato, Masaharu
    [J]. 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015: 1203-1206