Automatic language identification with discriminative language characterization based on SVM

Cited by: 9
Authors
Suo, Hongbin [1]
Li, Ming [1]
Lu, Ping [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIT Speech Lab, Beijing 100864, Peoples R China
Keywords
language identification; supervised speaker clustering; support vector machine; discriminative language characterization score vector; pair-wise posterior probability estimation
DOI
10.1093/ietisy/e91-d.3.567
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Robust automatic language identification (LID) is the task of identifying the language of a short utterance spoken by an unknown speaker. The mainstream approaches include parallel phone recognition language modeling (PPRLM), support vector machines (SVMs), and general Gaussian mixture models (GMMs). These systems use classifiers to map the cepstral features of spoken utterances into high-level scores. In this paper, to increase the dimension of the score vector and to alleviate inter-speaker variability within the same language, multiple data groups based on supervised speaker clustering are employed to generate discriminative language characterization score vectors (DLCSV). Back-end SVM classifiers are then used to model the probability distribution of each target language in the DLCSV space. Finally, the output scores of the back-end classifiers are calibrated by a pair-wise posterior probability estimation (PPPE) algorithm. The proposed language identification frameworks are evaluated on the 2003 NIST Language Recognition Evaluation (LRE) databases, and the experiments show that the system described in this paper produces results comparable to those of existing systems. In particular, the SVM framework achieves an equal error rate (EER) of 4.0% in the 30-second task and outperforms state-of-the-art systems by more than 30% in relative error reduction. In addition, the proposed PPRLM and GMM algorithms achieve EERs of 5.1% and 5.0%, respectively.
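The abstract does not spell out the PPPE algorithm, so the following is only a rough sketch of the general idea of turning pair-wise probabilities into one posterior per language. It assumes a closed-form coupling rule commonly attributed to Price et al., which may differ from the authors' method; the function name `couple_pairwise` and the input layout `r[i][j]` are hypothetical.

```python
from typing import List


def couple_pairwise(r: List[List[float]]) -> List[float]:
    """Combine pair-wise probabilities into per-class posteriors.

    r[i][j] is assumed to estimate P(class i | class i or j, utterance).
    Diagonal entries are ignored. This uses the closed-form rule
        p_i = 1 / (sum_{j != i} 1 / r[i][j] - (k - 2))
    followed by renormalization. It is an illustrative assumption,
    not necessarily the PPPE algorithm of the paper.
    """
    k = len(r)
    eps = 1e-12  # guards against division by zero for r[i][j] == 0
    raw = []
    for i in range(k):
        s = sum(1.0 / max(r[i][j], eps) for j in range(k) if j != i)
        # Each term 1 / r[i][j] is >= 1, so s - (k - 2) >= 1 and is safe to invert.
        raw.append(1.0 / (s - (k - 2)))
    total = sum(raw)
    return [p / total for p in raw]


if __name__ == "__main__":
    # Toy example with three languages and consistent pair-wise estimates.
    true_p = [0.7, 0.2, 0.1]
    r = [[0.5 if i == j else true_p[i] / (true_p[i] + true_p[j])
          for j in range(3)] for i in range(3)]
    print(couple_pairwise(r))
```

When the pair-wise estimates are mutually consistent, as in the toy example, this rule recovers the underlying posteriors; with noisy pair-wise scores it only gives an approximation, which is why the result is renormalized.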
Pages: 567-575
Number of pages: 9