A hierarchical language identification system for Indian languages

Cited by: 41
Authors
Jothilakshmi, S. [1]
Ramalingam, V. [1]
Palanivel, S. [1]
Affiliation
[1] Annamalai Univ, Dept Comp Sci & Engn, Annamalainagar 608002, Tamil Nadu, India
Keywords
Language identification; Mel frequency cepstral coefficients; Shifted delta cepstral coefficients; Hidden Markov model; Gaussian mixture model; Neural networks; Indian languages; SPOKEN; RECOGNITION; SPEECH;
DOI
10.1016/j.dsp.2011.11.008
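Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract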
Automatic spoken Language IDentification (LID) is the task of identifying the language from a short segment of speech uttered by an unknown speaker. This work develops a two-level language identification system for Indian languages using acoustic features. The first level identifies the family of the spoken language; the utterance is then passed to the second level, which identifies the particular language within that family. The performance of the system is analyzed for various acoustic features and different classifiers, and a suitable acoustic feature and pattern classification model are suggested for effective identification of Indian languages. The system has been modeled using hidden Markov models (HMM), Gaussian mixture models (GMM) and artificial neural networks (ANN). We studied the discriminative power of the system for three feature sets: mel frequency cepstral coefficients (MFCC), MFCC with delta and acceleration coefficients, and shifted delta cepstral (SDC) coefficients. The LID performance was then studied as a function of training and testing set size. To carry out the experiments, a new database was created for 9 Indian languages. The GMM-based LID system using MFCC with delta and acceleration coefficients performs well, with 80.56% accuracy. The performance of the GMM-based LID system with SDC is also considerable. (C) 2012 Elsevier Inc. All rights reserved.
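The two-level decision described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: each class is modeled by a single diagonal-covariance Gaussian as a stand-in for a full GMM, the family model is assumed to be trained on the pooled frames of its member languages, and the family names, language names, and 2-D feature vectors are hypothetical placeholders rather than the paper's database or MFCC features.

```python
import math
import random
from statistics import fmean, pvariance


class DiagGaussian:
    """Single diagonal-covariance Gaussian (a simplified stand-in for a GMM)."""

    def __init__(self, frames):
        dims = list(zip(*frames))  # one tuple of values per feature dimension
        self.mean = [fmean(d) for d in dims]
        self.var = [max(pvariance(d), 1e-6) for d in dims]  # variance floor

    def log_likelihood(self, frames):
        """Average per-frame log-likelihood of an utterance."""
        total = 0.0
        for x in frames:
            for xi, m, v in zip(x, self.mean, self.var):
                total += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        return total / len(frames)


class HierarchicalLID:
    """Level 1 picks the language family; level 2 picks a language in it."""

    def __init__(self, training):
        # training: {family: {language: [feature frames]}}
        self.family_models = {}
        self.language_models = {}
        for family, langs in training.items():
            # Assumption: the family model is fit on pooled member-language data.
            pooled = [f for frames in langs.values() for f in frames]
            self.family_models[family] = DiagGaussian(pooled)
            self.language_models[family] = {
                lang: DiagGaussian(frames) for lang, frames in langs.items()
            }

    def identify(self, frames):
        family = max(
            self.family_models,
            key=lambda f: self.family_models[f].log_likelihood(frames),
        )
        models = self.language_models[family]  # only this family's languages compete
        language = max(models, key=lambda l: models[l].log_likelihood(frames))
        return family, language


def toy_frames(center, n, rng):
    """Hypothetical 2-D 'feature frames' scattered around a class center."""
    return [tuple(rng.gauss(c, 1.0) for c in center) for _ in range(n)]


rng = random.Random(0)
training = {
    "family_A": {
        "lang_A1": toy_frames((0.0, 0.0), 200, rng),
        "lang_A2": toy_frames((0.0, 3.0), 200, rng),
    },
    "family_B": {
        "lang_B1": toy_frames((10.0, 10.0), 200, rng),
        "lang_B2": toy_frames((10.0, 13.0), 200, rng),
    },
}
lid = HierarchicalLID(training)
print(lid.identify(toy_frames((10.0, 13.0), 50, rng)))
```

In this sketch the second level scores only the languages of the family chosen at the first level, so a family-level error cannot be recovered later; that trade-off is inherent in any strictly hierarchical decision of this kind.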
Pages: 544-553
Page count: 10