Speech-Based Emotion Analysis Using Log-Mel Spectrograms and MFCC Features

Cited by: 0
Authors
Yetkin, Ahmet Kemal [1]
Kose, Hatice [2]
Affiliations
[1] Istanbul Tech Univ, Bilgisayar & Bilisim Fak, Istanbul, Turkiye
[2] Istanbul Tech Univ, Yapay Zeka & Veri Muhendisligi Bolumu, Istanbul, Turkiye
Keywords
speech emotion recognition; machine learning; neural networks; log-Mel spectrogram; MFCC
DOI
10.1109/SIU59756.2023.10223785
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a method for recognizing emotions from speech using Mel spectrograms and MFCC features, which capture the spectral characteristics of speech signals. To predict emotions from the features extracted from the dataset, convolutional neural networks (CNNs) and fine-tuned pre-trained models are used. The pre-trained models are fine-tuned with some optimizations, and a one-dimensional convolutional neural network is constructed. The results show that the proposed method achieves an accuracy of over 80% in predicting emotions from speech and, through comparison, demonstrate the effectiveness of the approach.
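The two feature types named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the sampling rate, FFT size, hop length, number of Mel bands, number of cepstral coefficients, and all function names below are assumptions chosen for illustration, and a synthetic tone stands in for a speech recording. The sketch computes a log-Mel spectrogram from short-time power spectra, then derives MFCCs by applying an unnormalized DCT-II across the Mel bands.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-Mel mapping; other variants exist.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers equally spaced on the Mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=160, n_mels=40):
    # Frame the signal, window each frame, and take the power spectrum.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop: t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto the Mel filterbank and compress with a logarithm.
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel_power + 1e-10)  # shape: (n_frames, n_mels)

def mfcc(log_mel, n_mfcc=13):
    # Unnormalized DCT-II over the Mel axis; keep the first n_mfcc terms.
    n_mels = log_mel.shape[1]
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ basis.T  # shape: (n_frames, n_mfcc)

# Hypothetical input: one second of a 440 Hz tone at an assumed 16 kHz rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

log_mel = log_mel_spectrogram(tone, sr)
coeffs = mfcc(log_mel)
print(log_mel.shape, coeffs.shape)
```

In a setup like the one the abstract describes, the two-dimensional log-Mel array could be treated as an image-like input for a pre-trained model, while the MFCC sequence could feed a one-dimensional CNN; which feature is paired with which model is an assumption here, since the abstract does not say.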
Pages: 4