Multimodal Affect Classification at Various Temporal Lengths

Cited by: 20
Authors
Kim, Jonathan C. [1 ]
Clements, Mark A. [1 ]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Funding
U.S. National Science Foundation (NSF)
Keywords
Audio-visual emotion recognition; classifier fusion; speech analysis; EMOTION RECOGNITION; ACOUSTIC PROFILES; VOCAL EXPRESSIONS;
DOI
10.1109/TAFFC.2015.2411273
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Earlier studies have shown that certain emotional characteristics are best observed at different analysis-frame lengths. When features of multiple modalities are extracted, it is reasonable to believe that different temporal lengths would better model the underlying characteristics that result from different emotions. In this study, we examine the use of such differing timescales in constructing emotion classifiers. A novel fusion method is introduced that utilizes the outputs of individual classifiers trained on multi-dimensional inputs with multiple temporal lengths. We used the IEMOCAP database, which contains audiovisual recordings of 10 subjects in dyadic interaction settings. The classification task was performed over three emotional dimensions: valence, activation, and dominance. The results demonstrate the utility of the multimodal-multitemporal approach: statistically significant improvements in accuracy are seen in all three dimensions when compared with unimodal-unitemporal classifiers.
Pages: 371-384
Page count: 14
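
The abstract describes the approach only at a high level: one classifier per (modality, analysis-window length) combination, with a fusion stage that combines their outputs. The sketch below is a minimal, hypothetical illustration of that kind of multimodal-multitemporal decision fusion in Python with scikit-learn; the feature dimensions, the 0.5 s / 2.0 s window lengths, the SVM base classifiers, and the logistic-regression fusion step are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch of multimodal-multitemporal decision fusion.
# NOT the paper's implementation: the feature sizes, window lengths, SVM base
# classifiers, and logistic-regression fuser are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples = 600

# Placeholder features standing in for IEMOCAP-style data: one feature matrix
# per (modality, analysis-window length) combination.
streams = {
    ("audio", 0.5): rng.normal(size=(n_samples, 40)),  # e.g., short-window acoustic stats
    ("audio", 2.0): rng.normal(size=(n_samples, 40)),  # e.g., long-window acoustic stats
    ("video", 0.5): rng.normal(size=(n_samples, 30)),  # e.g., short-window facial features
    ("video", 2.0): rng.normal(size=(n_samples, 30)),  # e.g., long-window facial features
}
# Binary label for one emotional dimension (e.g., high vs. low valence).
y = rng.integers(0, 2, size=n_samples)

idx_train, idx_test = train_test_split(
    np.arange(n_samples), test_size=0.3, random_state=0
)

# Stage 1: one base classifier per (modality, window-length) stream.
probs_train, probs_test = [], []
for (modality, win_len), X in streams.items():
    clf = SVC(probability=True).fit(X[idx_train], y[idx_train])
    probs_train.append(clf.predict_proba(X[idx_train])[:, 1])
    probs_test.append(clf.predict_proba(X[idx_test])[:, 1])
    print(f"{modality} @ {win_len}s accuracy: {clf.score(X[idx_test], y[idx_test]):.3f}")

# Stage 2: fuse the per-stream posteriors with a simple meta-classifier.
fuser = LogisticRegression().fit(np.column_stack(probs_train), y[idx_train])
print("fused accuracy:", fuser.score(np.column_stack(probs_test), y[idx_test]))
```

In practice the fuser would be fit on out-of-fold (cross-validated) base-classifier outputs rather than on in-sample predictions, and the same two-stage setup would be repeated independently for each emotional dimension (valence, activation, dominance).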