Large Vocabulary Speech Recognition: Speaker Dependent and Speaker Independent

Cited by: 0
Authors
Hemakumar, G. [1]
Punitha, P. [2]
Affiliations
[1] Govt Coll Women, Dept Comp Sci, Mandya, India
[2] PESIT, Dept MCA, Bangalore, Karnataka, India
Keywords
Speaker independent; Speaker dependent; Normal fit; Baum-Welch algorithm
DOI
10.1007/978-81-322-2250-7_8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses large-vocabulary isolated-word and continuous Kannada speech recognition using syllables and a combination of the Hidden Markov Model (HMM) and the normal fit method. The models are designed to work in both speaker-dependent and speaker-independent modes. The experiment covers 6 million of the 10 million words in the Hampi text corpus. A 3-state Baum-Welch algorithm is used for training. The lambda(A, B, pi) parameters output for two successors are combined and passed into a normal fit, and the resulting normal fit parameters are labeled as a syllable or sub-word. The proposed model is compared with the Gaussian Mixture Model and with HMM (3-state Baum-Welch algorithm) in terms of memory requirement and recognition rate. The results indicate that combining HMM with the normal fit technique reduces the memory needed to build and store the speech models while maintaining a high recognition rate: the average word recognition rate (WRR) is 91.22 % and the average word error rate (WER) is 8.78 %. All computations are done in MATLAB.
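The abstract does not spell out how the HMM output is passed into the normal fit, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that all entries of a trained lambda(A, B, pi) are flattened into one vector and that the normal fit is the usual sample mean and n - 1 sample standard deviation. The helper names `flatten_hmm` and `normal_fit` and the toy 3-state model values are hypothetical, and the step that combines two successive outputs is left out because the abstract does not describe it.

```python
import math


def flatten_hmm(A, B, pi):
    """Concatenate every entry of lambda(A, B, pi) into one flat list.

    Assumption: the abstract does not say how the parameters are arranged
    before the normal fit; simple row-major flattening is used here.
    """
    flat = [x for row in A for x in row]
    flat += [x for row in B for x in row]
    flat += list(pi)
    return flat


def normal_fit(values):
    """Return (mu, sigma): sample mean and sample standard deviation (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for a normal fit")
    mu = sum(values) / n
    variance = sum((v - mu) ** 2 for v in values) / (n - 1)
    return mu, math.sqrt(variance)


# Toy 3-state left-to-right HMM with 4 observation symbols (made-up numbers).
A = [[0.6, 0.3, 0.1],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
B = [[0.5, 0.2, 0.2, 0.1],
     [0.1, 0.4, 0.4, 0.1],
     [0.1, 0.1, 0.3, 0.5]]
pi = [1.0, 0.0, 0.0]

params = flatten_hmm(A, B, pi)
mu, sigma = normal_fit(params)

# The stored model shrinks from len(params) numbers to the pair (mu, sigma).
print(f"{len(params)} HMM parameters summarised as mu={mu:.4f}, sigma={sigma:.4f}")
```

Under these assumptions, each syllable or sub-word label would be stored as two numbers instead of the full parameter set, which is one way to read the memory reduction claimed in the abstract.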
Pages: 73-80
Page count: 8
Related papers
  • [1] ON LARGE-VOCABULARY SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION
    LEE, KF
    [J]. SPEECH COMMUNICATION, 1988, 7 (04) : 375 - 379
  • [2] DSP-based large vocabulary speaker-independent speech recognition
    Hirayama, H
    Yoshida, K
    Koga, S
    Hattori, H
    [J]. NEC RESEARCH & DEVELOPMENT, 1996, 37 (04): 528 - 534
  • [3] ACOUSTIC MODELING OF SUBWORD UNITS FOR LARGE VOCABULARY SPEAKER INDEPENDENT SPEECH RECOGNITION
    LEE, CH
    RABINER, LR
    PIERACCINI, R
    WILPON, JG
    [J]. SPEECH AND NATURAL LANGUAGE, 1989, : 280 - 291
  • [4] On Speaker-Independent, Speaker-Dependent, and Speaker-Adaptive Speech Recognition
    Huang, Xuedong
    Lee, Kai-Fu
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (02): 150 - 157
  • [5] Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems
    Padmanabhan, M
    Bahl, LR
    Nahamoo, D
    Picheny, MA
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 701 - 704
  • [6] Probabilistic Latent Speaker Training for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1225 - 1228
  • [7] Probabilistic Latent Speaker Analysis for Large Vocabulary Speech Recognition
    Su, Dan
    Wu, Xihong
    Chi, Huisheng
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1889 - 1892
  • [8] Speaker verification through large vocabulary continuous speech recognition
    Newman, M
    Gillick, L
    Ito, Y
    McAllaster, D
    Peskin, B
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2419 - 2422
  • [9] Speaker selection training for large vocabulary continuous speech recognition
    Huang, C
    Chen, T
    Chang, E
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 609 - 612
  • [10] Experiments in speaker normalisation and adaptation for large vocabulary speech recognition
    Pye, D
    Woodland, PC
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1047 - 1050