A Speech Emotion Recognition Model Based on Multi-Level Local Binary and Local Ternary Patterns

Cited by: 19
Authors
Sonmez, Yesim Ulgen [1]
Varol, Asaf [2]
Affiliations
[1] Firat Univ, Fac Technol, Dept Software Engn, TR-23119 Elazig, Turkey
[2] Maltepe Univ, Fac Engn & Nat Sci, Dept Comp Engn, TR-34857 Istanbul, Turkey
Source
IEEE ACCESS | 2020 / Vol. 8
Keywords
Feature extraction; Time-frequency analysis; Classification algorithms; Databases; Transforms; Support vector machines; Discrete wavelet transform; local binary pattern; local ternary pattern; neighborhood component analysis; speech emotion recognition; FEATURE-EXTRACTION; FEATURE-SELECTION; CLASSIFICATION; FEATURES
DOI
10.1109/ACCESS.2020.3031763
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Interpreting a speech signal is challenging because it consists of different frequencies and features that vary with emotion. Although many algorithms are being developed in the speech emotion recognition (SER) domain, success rates vary with the spoken language, the emotions, and the database. In this study, a new lightweight and effective SER method with low computational complexity has been developed. This method, called 1BTPDN, is applied to the RAVDESS, EMO-DB, SAVEE, and EMOVO databases. First, low-pass filter coefficients are obtained by applying a one-dimensional discrete wavelet transform to the raw audio data. Features are then extracted by applying two textural analysis methods, a one-dimensional local binary pattern and a one-dimensional local ternary pattern, to each filter. Using neighborhood component analysis, the 1024 most dominant features are selected from 7680 features, and the remaining features are discarded. These 1024 features are used as the input of the classifier, a support vector machine with a third-degree polynomial kernel. The success rates of 1BTPDN reached 95.16%, 89.16%, 76.67%, and 74.31% on the RAVDESS, EMO-DB, SAVEE, and EMOVO databases, respectively. These recognition rates are higher than those of many state-of-the-art textural, acoustic, and deep learning SER methods.
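The feature-extraction stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet, the number of decomposition levels, the neighborhood size, or the ternary threshold, so the Haar filter, the 9 levels, the 8-neighbor window, the value `threshold=0.05`, and all function names below are assumptions chosen for illustration. One layout that happens to match the reported 7680 features is 10 signals (the raw audio plus 9 low-pass levels) times 3 histograms (binary, ternary upper, ternary lower) of 256 bins each; that breakdown is also an assumption.

```python
import math


def haar_lowpass(signal):
    """One level of Haar DWT approximation (low-pass) coefficients.
    The Haar wavelet is an assumption; the abstract names no wavelet."""
    n = len(signal) - len(signal) % 2  # drop a trailing odd sample
    return [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in range(0, n, 2)]


def lbp_ltp_histograms(signal, threshold):
    """Concatenate three 256-bin histograms for one signal:
    1D local binary pattern, ternary upper pattern, ternary lower pattern.
    Each 9-sample window compares its 8 neighbors with the center sample."""
    lbp = [0] * 256
    upper = [0] * 256
    lower = [0] * 256
    for start in range(len(signal) - 8):
        window = signal[start:start + 9]
        center = window[4]
        neighbors = window[:4] + window[5:]
        b = u = l = 0
        for bit, value in enumerate(neighbors):
            diff = value - center
            if diff >= 0:            # binary pattern: neighbor >= center
                b |= 1 << bit
            if diff > threshold:     # ternary code +1 goes to the upper pattern
                u |= 1 << bit
            elif diff < -threshold:  # ternary code -1 goes to the lower pattern
                l |= 1 << bit
        lbp[b] += 1
        upper[u] += 1
        lower[l] += 1
    return lbp + upper + lower


def extract_features(signal, levels=9, threshold=0.05):
    """Histograms of the raw signal and of each successive low-pass level.
    levels and threshold are hypothetical defaults, not values from the paper."""
    features = []
    current = list(signal)
    for _ in range(levels + 1):
        features.extend(lbp_ltp_histograms(current, threshold))
        current = haar_lowpass(current)
    return features


signal = [math.sin(0.05 * i) for i in range(8192)]  # synthetic stand-in for audio
features = extract_features(signal)
print(len(features))  # 7680 = 10 signals x 3 histograms x 256 bins
```

The selection of 1024 features by neighborhood component analysis and the classification step are left out of this sketch. For the classifier, a third-degree polynomial kernel SVM is available in scikit-learn as `SVC(kernel="poly", degree=3)`; whether the authors used that library is not stated in the abstract.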
Pages: 190784 - 190796
Page count: 13
Related papers
50 in total
  • [1] An Ensemble Model for Multi-Level Speech Emotion Recognition
    Zheng, Chunjun
    Wang, Chunli
    Jia, Ning
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (01)
  • [2] Speech Emotion Recognition based on Multi-Level Residual Convolutional Neural Networks
    Zheng, Kai
    Xia, ZhiGuang
    Zhang, Yi
    Xu, Xuan
    Fu, Yaqin
    [J]. ENGINEERING LETTERS, 2020, 28 (02) : 559 - 565
  • [3] Speech Emotion Recognition via Multi-Level Attention Network
    Liu, Ke
    Wang, Dekui
    Wu, Dongya
    Liu, Yutao
    Feng, Jun
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2278 - 2282
  • [4] SPEECH EMOTION RECOGNITION WITH CO-ATTENTION BASED MULTI-LEVEL ACOUSTIC INFORMATION
    Zou, Heqing
    Si, Yuke
    Chen, Chen
    Rajan, Deepu
    Chng, Eng Siong
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022 : 7367 - 7371
  • [5] Low-Order Multi-Level Features for Speech Emotion Recognition
    Tamulevicius, Gintautas
    Liogiene, Tatjana
    [J]. BALTIC JOURNAL OF MODERN COMPUTING, 2015, 3 (04) : 234 - 247
  • [6] Multiscale Local Binary Patterns for Facial Expression-Based Human Emotion Recognition
    Nigam, Swati
    Khare, Ashish
    [J]. COMPUTATIONAL VISION AND ROBOTICS, 2015, 332 : 71 - 77
  • [7] Face Recognition based on Multi-level Histogram Sequence Center-symmetric Local Binary Pattern and Fisherface
    Xu, Xiaoyu
    Li, Su
    Liu, Lan
    [J]. 2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017 : 448 - 451
  • [8] Multi-level Local Binary Pattern Analysis for Texture Characterization
    Suguna, R.
    Anandhakumar, P.
    [J]. ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, 2011, 198 : 375 - 386
  • [9] A Survey on Facial Recognition based on Local Directional and Local Binary Patterns
    Chengeta, Kennedy
    Viriri, Serestina
    [J]. 2018 CONFERENCE ON INFORMATION COMMUNICATIONS TECHNOLOGY AND SOCIETY (ICTAS), 2018
  • [10] Knowledge enhancement for speech emotion recognition via multi-level acoustic feature
    Zhao, Huan
    Huang, Nianxin
    Chen, Haijiao
    [J]. CONNECTION SCIENCE, 2024, 36 (01)