Frame-level phoneme classification using inductive inference

Cited by: 2
Author
Samouelian, A. [1]
Affiliation
[1] UNIV SYDNEY,DEPT ELECT ENGN,SPEECH TECHNOL RES LAB,SYDNEY,NSW 2006,AUSTRALIA
Source
COMPUTER SPEECH AND LANGUAGE | 1997 / Vol. 11 / Issue 3
DOI
10.1006/csla.1997.0029
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel approach to frame-level classification by the use of inductive inference (decision trees). The proposed system (Samouelian, 1994a) uses the C4.5 induction system (Quinlan, 1993, 1996) to capture knowledge about the structure and characteristics of the speech signal explicitly from the database. The decision tree is generated automatically from the training speech database. The database contains labelled examples, each consisting of a feature vector and its corresponding label for one frame. The feature vector may consist of any number of different feature sets, and the label may be at the phoneme, sub-word or word level. This approach allows the integration of features from existing signal processing techniques that are currently used in stochastic modelling, such as hidden Markov models (HMMs), with acoustic-phonetic features, which have been the cornerstone of traditional knowledge-based techniques. The aim of this research is to demonstrate that induction systems can provide a viable alternative automatic speech recognition technique by allowing the combination of features from any of the above feature representations to achieve optimum classification. Results of five experiments using C4.5 are reported. The first four experiments use a small corpus of Australian English consonants (plosives, liquids and nasals) and four different feature sets, and they report frame-level classification results for speaker-dependent and speaker-independent modes. The fifth experiment uses the TIMIT database and the mel frequency cepstral coefficient (MFCC) feature set, and reports speaker-independent frame-level classification results on the training data and test data. © 1997 Academic Press Limited.
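The frame-level setup described in the abstract (one feature vector and one label per frame, with a decision tree induced from those examples) can be illustrated with a minimal sketch. This is not the C4.5 system or the author's implementation: it is a simplified pure-Python tree that picks binary threshold splits by gain ratio, the criterion C4.5 is known for, and omits pruning. The two features, their values, and the labels "p", "m", "l" are hypothetical toy data, not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(frames, labels):
    """Return (gain_ratio, feature_index, threshold) for the best split, or None."""
    base, n, best = entropy(labels), len(labels), None
    for f in range(len(frames[0])):
        values = sorted({x[f] for x in frames})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in zip(frames, labels) if x[f] <= t]
            right = [y for x, y in zip(frames, labels) if x[f] > t]
            p = len(left) / n
            gain = base - p * entropy(left) - (1 - p) * entropy(right)
            split_info = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
            ratio = gain / split_info
            if gain > 1e-12 and (best is None or ratio > best[0]):
                best = (ratio, f, t)
    return best

def build_tree(frames, labels):
    """Grow a tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    split = None if len(set(labels)) == 1 else best_split(frames, labels)
    if split is None:
        # Pure node or no informative split: predict the majority label.
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(x, y) for x, y in zip(frames, labels) if x[f] <= t]
    right = [(x, y) for x, y in zip(frames, labels) if x[f] > t]
    return (f, t, build_tree(*zip(*left)), build_tree(*zip(*right)))

def classify(tree, frame):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if frame[f] <= t else right
    return tree

# Hypothetical toy frames: (feature_0, feature_1) per frame, one label per frame.
frames = [(0.10, 0.80), (0.15, 0.75),   # labelled "p" (plosive-like)
          (0.60, 0.20), (0.65, 0.25),   # labelled "m" (nasal-like)
          (0.90, 0.50), (0.85, 0.55)]   # labelled "l" (liquid-like)
labels = ["p", "p", "m", "m", "l", "l"]

tree = build_tree(frames, labels)
print(classify(tree, (0.12, 0.78)))
```

Because each frame is just a tuple of numbers, features from different sources (for example cepstral coefficients and acoustic-phonetic measurements) could be concatenated into one vector without changing the tree-building code, which is the kind of feature integration the abstract argues for.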
Pages: 161-186
Page count: 26