Using probabilistic characteristic vector based on both phonetic and prosodic features for language identification

Cited by: 0
Authors
Hosseini Amereei S.A. [1]
Homayounpour M.M. [1]
Affiliation
[1] Laboratory for Intelligent Sound and Speech Processing, Amirkabir University of Technology, Tehran
Keywords
APRLM; GPRLM; Language identification; Pitch contour polynomial approximation; Probabilistic sequence kernel; Support vector machine
DOI
10.1109/ISTEL.2010.5734122
Abstract
Language identification (LID) is an important task in indexing audio signals. This paper introduces an LID system with a generative frontend based on both phonetic and prosodic features. The generative frontend is built upon an ensemble of Gaussian densities. Half of these Gaussian densities are trained to represent elementary speech sound units, and the other half are trained to represent prosodic properties; together, the two sets characterize a wide variety of languages. Shifted Delta Cepstral (SDC) coefficients and Pitch Contour Polynomial Approximation (PCPA) are used as features. The backend classifier is a Support Vector Machine (SVM). Several language identification experiments were conducted, and the proposed improvements were evaluated on the OGI-MLTS corpus. When all systems are based on PCPA, an SVM with the Generalized Linear Discriminant Sequence (GLDS) kernel and an SVM with the Probabilistic Sequence Kernel (PSK) both outperform the GMM, improving LID performance by about 2.1% and 5.9%, respectively. Furthermore, combining phonetic and prosodic features yielded an improvement of roughly 4% in our four-language identification experiments. © 2010 IEEE.
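As a rough illustration of the pitch contour polynomial approximation named in the abstract, the sketch below fits a low-order polynomial to one voiced pitch segment and keeps the coefficients as a fixed-length prosodic feature vector. The abstract does not give the polynomial basis, order, or segment length, so the Legendre basis, `order=3`, the function name `pcpa_features`, and the synthetic contour are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np
from numpy.polynomial import legendre


def pcpa_features(f0_segment, order=3):
    """Approximate a pitch (F0) segment with a Legendre polynomial.

    Hypothetical helper: basis and order are illustrative assumptions,
    not values taken from the paper.
    Returns the order + 1 coefficients, lowest degree first.
    """
    f0 = np.asarray(f0_segment, dtype=float)
    if f0.size <= order:
        raise ValueError("segment needs more than `order` samples")
    # Map the segment onto [-1, 1] so segments of different
    # durations yield comparable coefficients.
    t = np.linspace(-1.0, 1.0, f0.size)
    return legendre.legfit(t, f0, order)


# Synthetic contour in Hz: a gently rising, slightly curved pitch track.
t = np.linspace(-1.0, 1.0, 50)
contour = 120.0 + 15.0 * t - 8.0 * t**2
coeffs = pcpa_features(contour, order=3)

# Rebuild the contour from the coefficients to check the fit.
reconstructed = legendre.legval(t, coeffs)
```

Each coefficient vector has the same length regardless of segment duration, which is what lets such features be modelled by Gaussian densities or passed to an SVM backend.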
Pages: 750-754
Page count: 4