Using probabilistic characteristic vector based on both phonetic and prosodic features for language identification

Cited by: 0
Authors
Hosseini Amereei S.A. [1]
Homayounpour M.M. [1]
Affiliations
[1] Laboratory for Intelligent Sound and Speech Processing, Amirkabir University of Technology, Tehran
Keywords
APRLM; GPRLM; Language identification; Pitch contour polynomial approximation; Probabilistic sequence kernel; Support vector machine
DOI
10.1109/ISTEL.2010.5734122
Abstract
Language identification (LID) is an important task in the indexing of audio signals. This paper introduces an LID system with a generative frontend based on both phonetic and prosodic features. The generative frontend is built upon an ensemble of Gaussian densities. Half of these Gaussian densities are trained to represent elementary speech sound units, and the other half are trained to represent prosodic properties; together they characterize a wide variety of languages. Shifted Delta Cepstral (SDC) coefficients and Pitch Contour Polynomial Approximation (PCPA) are used as features. The backend classifier is a Support Vector Machine (SVM). Several language identification experiments were conducted, and the proposed improvements were evaluated on the OGI-MLTS corpus. With all systems based on PCPA, an SVM using the Generalized Linear Discriminant Sequence (GLDS) kernel and an SVM using the Probabilistic Sequence Kernel (PSK) both outperform a Gaussian mixture model (GMM), improving LID performance by about 2.1% and 5.9%, respectively. Furthermore, combining phonetic and prosodic features yielded an improvement of about 4% in our four-language identification experiments. © 2010 IEEE.
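As an illustration of the pitch contour polynomial approximation named in the abstract, the sketch below fits a low-order polynomial to one pitch segment and keeps the coefficients as a prosodic feature vector. The abstract does not give the polynomial order, the segmentation, or the normalization, so the function name `pcpa_features`, the order of 3, the [-1, 1] time axis, and the synthetic contour are all assumptions made for this example, not the authors' settings.

```python
import numpy as np


def pcpa_features(f0_segment, order=3):
    """Fit a polynomial to one pitch (F0) segment and return its coefficients.

    Hypothetical helper: the order and time normalization are assumptions,
    not values taken from the paper.
    """
    f0 = np.asarray(f0_segment, dtype=float)
    if f0.size <= order:
        raise ValueError("segment needs more than `order` samples")
    # Map every segment onto the same time axis [-1, 1] so that coefficients
    # from segments of different lengths are comparable.
    t = np.linspace(-1.0, 1.0, f0.size)
    # np.polyfit returns coefficients with the highest degree first.
    return np.polyfit(t, f0, order)


# Synthetic pitch segment in Hz (illustrative data only).
t = np.linspace(-1.0, 1.0, 20)
contour = 120.0 + 15.0 * t - 10.0 * t**2
coeffs = pcpa_features(contour, order=3)
print(coeffs.shape)
```

In a system like the one described, coefficient vectors of this kind would be the prosodic counterpart of the SDC vectors and would be passed on to the Gaussian frontend and the SVM backend.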
Pages: 750-754
Page count: 4
Related papers
50 in total
  • [1] Language identification using phonetic and prosodic HMMs with feature normalization
    Obuchi, Y
    Sato, N
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 569 - 572
  • [2] Prosodic features for language identification
    Mary, Leena
    Yegnanarayana, B.
    ICSCN 2008: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING COMMUNICATIONS AND NETWORKING, 2008, : 57 - +
  • [3] Language Identification System using MFCC and Prosodic Features
    Bhattacharjee, Utpal
    Sarmah, Kshirod
    2013 INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND SIGNAL PROCESSING (ISSP), 2013, : 194 - 197
  • [4] Sparse Representation based Language Identification using Prosodic Features for Indian Languages
    Singh, Om Prakash
    Haris, B. C.
    Sinha, Rohit
    Chettri, Bhusan
    Pradhan, Abhishek
    2013 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2013,
  • [5] Analysis and Selection of Prosodic Features for Language Identification
    Ng, Raymond W. M.
    Lee, Tan
    Leung, Cheung-Chi
    Ma, Bin
    Li, Haizhou
    2009 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2009, : 123 - 128
  • [6] Combining cepstral and prosodic features in language identification
    Yin, Bo
    Ambikairajah, Eliathamby
    Chen, Fang
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS, 2006, : 254 - +
  • [7] Neural network classifiers for language identification using phonotactic and prosodic features
    Mary, L
    Rao, KS
    Yegnanarayana, B
    2005 INTERNATIONAL CONFERENCE ON INTELLIGENT SENSING AND INFORMATION PROCESSING, PROCEEDINGS, 2005, : 404 - 408
  • [8] Automatic language identification for Berber and Arabic languages using prosodic features
    Lounnas, Khlaed
    Demri, Lyes
    Teffahi, Hocine
    Falek, Leila
    PROCEEDINGS 2018 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL SCIENCES AND TECHNOLOGIES IN MAGHREB (CISTEM), 2018, : 239 - 242
  • [9] Speaker Verification and Spoken Language Identification using a Generalized I-vector Framework with Phonetic Tokenizations and Tandem Features
    Li, Ming
    Liu, Wenbo
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1120 - 1124
  • [10] Using prosodic features in language models for meetings
    Huang, Songfang
    Renals, Steve
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2008, 4892 : 192 - 203