Locally Discriminant Diffusion Projection and Its Application in Speech Emotion Recognition

Times cited: 1
Authors
Xu, Xinzhou [1]
Huang, Chengwei [2]
Wu, Chen [1]
Zhao, Li [3]
Affiliations
[1] Southeast Univ, Minist Educ, Key Lab Underwater Acoust Signal Proc, Nanjing, Jiangsu, Peoples R China
[2] Soochow Univ, Sch Phys Sci & Technol, Suzhou, Peoples R China
[3] Soochow Univ, Minist Educ, Key Lab Child Dev & Learning Sci, Key Lab Underwater Acoust Signal Proc, Suzhou, Peoples R China
Keywords
diffusion maps; graph embedding framework; locally discriminant diffusion projection; speech emotion recognition; DIMENSIONALITY REDUCTION; FRAMEWORK; FEATURES
DOI
10.7305/automatika.2016.07.853
CLC number
TP [automation technology; computer technology]
Subject classification code
0812
Abstract
The existing Diffusion Maps method applies diffusion to data samples through a Markov random walk. In this paper, to provide a general solution form of Diffusion Maps, we first propose a generalized single-graph-diffusion embedding framework built on the graph embedding framework. Second, by designing the embedding graph of this framework, we propose an algorithm for speech emotion recognition, Locally Discriminant Diffusion Projection (LDDP). LDDP is the projection form of improved Diffusion Maps and incorporates both discriminant information and local information. Its linear and kernelized forms (LLDDP and KLDDP, respectively) are used to reduce the dimensionality of the original speech emotion features. We validate the proposed algorithm on two widely used speech emotion databases, EMO-DB and eNTERFACE'05. The experimental results show that the proposed LDDP methods, LLDDP and KLDDP, outperform other state-of-the-art dimensionality reduction methods based on graph embedding or discriminant analysis.
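The abstract compresses the method into a few steps: build a graph over the samples that encodes both label (discriminant) and neighbourhood (local) structure, diffuse it with a Markov random walk as in Diffusion Maps, and linearize the resulting embedding into an explicit projection. As a rough illustration of that recipe, here is a minimal NumPy sketch, assuming Gaussian affinities on a symmetrized within-class k-nearest-neighbour graph and a generalized eigenproblem for the linear projection; the function name llddp_sketch and every weight and parameter choice are our assumptions for illustration, not the paper's exact LDDP graph design.

```python
import numpy as np

def llddp_sketch(X, y, n_components=2, n_neighbors=5, sigma=1.0, t=2):
    """Illustrative linear diffusion-based discriminant projection.

    A minimal sketch of the general recipe the abstract describes
    (diffusion on a locality- and label-aware graph, linearized via
    graph embedding); the exact LDDP graph weights differ in the paper.
    X: (n_samples, n_features), y: integer class labels.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances and Gaussian affinities.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # Keep only local, same-class edges (discriminant + local information).
    same = (y[:, None] == y[None, :])
    knn = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]   # exclude self
    mask = np.zeros_like(W, dtype=bool)
    rows = np.repeat(np.arange(n), n_neighbors)
    mask[rows, knn.ravel()] = True
    W = W * same * (mask | mask.T)
    # Markov normalization and t-step diffusion (the Diffusion Maps core).
    D = W.sum(axis=1)
    P = W / np.maximum(D[:, None], 1e-12)
    Pt = np.linalg.matrix_power(P, t)
    # Symmetrized diffusion affinity, then linearized graph embedding:
    # minimize a^T X^T L X a subject to a^T X^T Dm X a = 1.
    Wt = 0.5 * (Pt + Pt.T)
    Dm = np.diag(Wt.sum(axis=1))
    L = Dm - Wt
    Sw = X.T @ L @ X + 1e-6 * np.eye(X.shape[1])   # smoothness along graph
    Sd = X.T @ Dm @ X + 1e-6 * np.eye(X.shape[1])  # scale constraint
    evals, evecs = np.linalg.eig(np.linalg.solve(Sd, Sw))
    order = np.argsort(evals.real)                  # smallest eigenvalues
    A = evecs[:, order[:n_components]].real
    return X @ A, A
```

A quick smoke test on synthetic two-class data: `X = np.random.randn(60, 12); y = np.repeat([0, 1], 30); Z, A = llddp_sketch(X, y)` returns the 2-D projection Z and the projection matrix A. The kernelized variant (KLDDP in the paper) would play the same game in a kernel-induced feature space instead of the input space.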
Pages: 37-45
Page count: 9
Related papers
50 records in total
  • [21] Coupled Discriminant Subspace Alignment for Cross-database Speech Emotion Recognition
    Li, Shaokai
    Song, Peng
    Zhao, Keke
    Zhang, Wenjing
    Zheng, Wenming
    INTERSPEECH 2022, 2022: 4695-4699
  • [22] Transferable discriminant linear regression for cross-corpus speech emotion recognition
    Li, Shaokai
    Song, Peng
    Zhang, Wenjing
    APPLIED ACOUSTICS, 2022, 197
  • [23] Learning Corpus-Invariant Discriminant Feature Representations for Speech Emotion Recognition
    Song, Peng
    Ou, Shifeng
    Du, Zhenbin
    Guo, Yanyan
    Ma, Wenming
    Liu, Jinglei
    Zheng, Wenming
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (05): 1136-1139
  • [24] Linear discriminant analysis using a generalized mean of class covariances and its application to speech recognition
    Sakai, Makoto
    Kitaoka, Norihide
    Nakagawa, Seiichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03): 478-487
  • [25] Articulation constrained learning with application to speech emotion recognition
    Shah, Mohit
    Tu, Ming
    Berisha, Visar
    Chakrabarti, Chaitali
    Spanias, Andreas
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2019, 2019 (01)
  • [26] Application of probabilistic neural network for speech emotion recognition
    Deshmukh S.
    Gupta P.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2024, 27 (01): 19-28
  • [28] Locally discriminant projection with Kernels for feature extraction
    Li, Jun-Bao
    Chu, Shu-Chuan
    Pan, Jeng-Shyang
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2007, 4632: 586+
  • [29] Emotion Recognition From Speech Using Fisher's Discriminant Analysis and Bayesian Classifier
    Atasoy, Huseyin
    Yildirim, Serdar
    Yildirim, Esen
    2015 23RD SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2015: 2513-2516
  • [30] Transfer Sparse Discriminant Subspace Learning for Cross-Corpus Speech Emotion Recognition
    Zhang, Weijian
    Song, Peng
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28: 307-318