Phoneme sequence recognition via DTW-based classification

Cited by: 6
Authors
Hamooni, Hossein [1]
Mueen, Abdullah [1]
Neel, Amy [2]
Affiliations
[1] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
[2] Univ New Mexico, Dept Speech & Hearing Sci, Albuquerque, NM 87131 USA
Keywords
Phoneme classification; DTW-based classification; Phonetic time series; Big data; Sequence recognition
DOI
10.1007/s10115-015-0885-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phonemes are the smallest units of sound produced by a human being. Automatic classification of phonemes is a well-researched topic in linguistics due to its potential for robust speech recognition. With the recent advancement of phonetic segmentation algorithms, it is now possible to generate datasets of millions of phonemes automatically. Phoneme classification on such datasets is a challenging data mining task because of the large number of classes (over a hundred) and the complexity of existing methods. In this paper, we introduce the phoneme classification problem as a data mining task. We propose a dual-domain (time and frequency) hierarchical classification algorithm. Our method uses a dynamic time warping (DTW)-based classifier in the top layers and time-frequency features in the lower layer. We cross-validate our method on phonemes from three online dictionaries and achieve up to 35% improvement in classification compared with existing techniques. We further modify our vowel classifier by adopting DTW distance over time-frequency coefficients and gain an additional 3% improvement. We provide case studies on classifying accented phonemes and on speaker-invariant phoneme classification. Finally, we demonstrate how phoneme classification can be used to recognize speech.
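The abstract names a DTW-based classifier without giving its details. As a rough illustration only, the sketch below shows a textbook DTW distance with an optional Sakoe-Chiba band, plus a plain 1-nearest-neighbor classifier built on it. This is not the authors' hierarchical dual-domain algorithm; the function names `dtw_distance` and `classify_1nn`, the squared-difference local cost, and the toy sequences are all assumptions made for the example.

```python
import math


def dtw_distance(a, b, window=None):
    """Textbook DTW distance between two 1-D numeric sequences.

    `window` is an optional Sakoe-Chiba band half-width (hypothetical
    parameter for this sketch); None means unconstrained warping.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least as wide as the length difference,
    # otherwise the end cell (n, m) is unreachable.
    window = max(window, abs(n - m))

    # cost[i][j] = cheapest cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            d = (a[i - 1] - b[j - 1]) ** 2  # assumed local cost
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s predecessor path
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return math.sqrt(cost[n][m])


def classify_1nn(query, labeled_series):
    """Return the label of the training series closest to `query` under DTW."""
    best_label, best_dist = None, math.inf
    for series, label in labeled_series:
        d = dtw_distance(query, series)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data standing in for phoneme time series (made up for illustration).
training = [
    ([0, 1, 2, 1, 0], "a"),
    ([2, 2, 2, 2, 2], "b"),
]
predicted = classify_1nn([0, 0, 1, 2, 1, 0], training)
```

The query is a time-stretched copy of the first training series, so warping aligns the two at zero cost and the classifier picks that label. A Euclidean comparison could not even be computed here, because the sequences differ in length.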
Pages: 253-275
Number of pages: 23
Related papers
50 in total
  • [1] Phoneme sequence recognition via DTW-based classification
    Hossein Hamooni
    Abdullah Mueen
    Amy Neel
    [J]. Knowledge and Information Systems, 2016, 48 : 253 - 275
  • [2] Visual Place Recognition by DTW-based sequence alignment
    Hafez, A. H. Abdul
    Tello, Ammar
    Alqaraleh, Saed
    [J]. 2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [3] Research on Isolated word Recognition with DTW-based
    Xu, Lijun
    Ke, Minyi
    [J]. PROCEEDINGS OF 2012 7TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, VOLS I-VI, 2012, : 139 - 141
  • [4] DTW-based feature selection for speech recognition and speaker recognition
    Liu, Jing-Wei
    Xu, Mei-Zhi
    Zheng, Zhong-Guo
    Cheng, Qian-Sheng
    [J]. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2005, 18 (01) : 50 - 54
  • [5] The DTW-based representation space for seismic pattern classification
    Orozco-Alzate, Mauricio
    Alexandra Castro-Cabrera, Paola
    Bicego, Manuele
    Makario Londono-Bonilla, John
    [J]. COMPUTERS & GEOSCIENCES, 2015, 85 : 86 - 95
  • [6] Signal enhancement and efficient DTW-based comparison for wearable gait recognition
    Avola, Danilo
    Cinque, Luigi
    De Marsico, Maria
    Fagioli, Alessio
    Foresti, Gian Luca
    Mancini, Maurizio
    Mecca, Alessio
    [J]. COMPUTERS & SECURITY, 2024, 137
  • [7] Research and Improvement on Embedded System Application of DTW-based Speech Recognition
    Wan, Chun
    Liu, Lili
    [J]. 2008 2ND INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY AND IDENTIFICATION, 2008, : 401 - 404
  • [8] Cross-words reference template for DTW-based speech recognition systems
    Abdulla, WH
    Chow, D
    Sin, G
    [J]. IEEE TENCON 2003: CONFERENCE ON CONVERGENT TECHNOLOGIES FOR THE ASIA-PACIFIC REGION, VOLS 1-4, 2003, : 1576 - 1579
  • [9] Robust DTW-based recognition algorithm for hand-held consumer devices
    Kim, C
    Seo, KD
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2005, 51 (02) : 699 - 709
  • [10] Robust DTW-based recognition algorithm for hand-held consumer devices
    Kim, C
    Seo, K
    [J]. ICCE: 2005 INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, DIGEST OF TECHNICAL PAPERS, 2005, : 433 - 434