Frame-level phoneme classification using inductive inference

Cited by: 2
Author
Samouelian, A [1]
Affiliation
[1] UNIV SYDNEY,DEPT ELECT ENGN,SPEECH TECHNOL RES LAB,SYDNEY,NSW 2006,AUSTRALIA
Source
COMPUTER SPEECH AND LANGUAGE | 1997 / Volume 11 / Issue 03
DOI
10.1006/csla.1997.0029
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel approach to frame-level classification by the use of inductive inference (decision trees). The proposed system (Samouelian, 1994a) uses the C4.5 induction system (Quinlan, 1993, 1996) to capture the knowledge about the structure and characteristics of the speech signal explicitly from the database. The decision tree is generated automatically from the training speech database. The database contains labelled examples in the form of a feature vector and its corresponding label for each frame. The feature vector may consist of any number of different feature sets and the label may be at the phoneme, sub-word or word level. This approach allows the integration of features from existing signal processing techniques that are currently used in stochastic modelling such as hidden Markov models (HMMs), and acoustic-phonetic features, which have been the cornerstone of traditional knowledge-based techniques. The aim of this research is to demonstrate that induction systems can provide a viable alternative technique for automatic speech recognition by allowing the combination of features from any of the above feature representations to achieve optimum classification. Results from five experiments using C4.5 are reported. The first four experiments use a small corpus of Australian English consonants (plosives, liquids and nasals) and four different feature sets, and they report frame-level classification results for speaker-dependent and speaker-independent modes. The fifth experiment uses the TIMIT database and the mel frequency cepstral coefficient (MFCC) feature set and reports frame-level classification results for speaker-independent experiments on the training data and test data. © 1997 Academic Press Limited.
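The abstract describes inducing a decision tree from per-frame examples, each a feature vector paired with a label. As a rough illustration only, the sketch below picks a single threshold split over labelled frames using the gain ratio, the split criterion associated with C4.5. It is not the paper's system: the feature values, the labels "p", "m" and "l", and all function names are hypothetical, and the full C4.5 procedure (recursive tree growth, pruning, its exact threshold selection) is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(frames, labels, feature, threshold):
    """Gain ratio of the binary split frames[i][feature] <= threshold."""
    left = [y for x, y in zip(frames, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(frames, labels) if x[feature] > threshold]
    if not left or not right:
        return 0.0  # degenerate split, nothing is separated
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    gain = (entropy(labels)
            - weights[0] * entropy(left)
            - weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info


def best_split(frames, labels):
    """Return (gain_ratio, feature_index, threshold) of the best single split."""
    best = (0.0, None, None)
    for feature in range(len(frames[0])):
        values = sorted({x[feature] for x in frames})
        # Candidate thresholds: midpoints between consecutive distinct values
        # (a simplification of how C4.5 chooses thresholds).
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            score = gain_ratio(frames, labels, feature, threshold)
            if score > best[0]:
                best = (score, feature, threshold)
    return best


# Hypothetical toy data: six frames, two made-up features per frame,
# each labelled with an invented phoneme class.
frames = [(0.1, 2.0), (0.2, 1.8), (0.9, 2.1),
          (1.0, 1.9), (0.15, 0.5), (0.95, 0.4)]
labels = ["p", "p", "m", "m", "l", "l"]

score, feature, threshold = best_split(frames, labels)
print(f"split on feature {feature} at {threshold:.2f} (gain ratio {score:.3f})")
```

A full tree would apply `best_split` recursively to each resulting subset until the frames in a node share one label or a stopping rule is met.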
Pages: 161-186
Page count: 26