CHEAVD: a Chinese natural emotional audio-visual database

Cited by: 64
Authors
Li, Ya [1]
Tao, Jianhua [1, 2, 3]
Chao, Linlin [1]
Bao, Wei [1, 4]
Liu, Yazhu [1, 4]
Affiliations
[1] Chinese Acad Sci, NLPR, Inst Automat, Beijing, Peoples R China
[2] Chinese Acad Sci, CAS Ctr Excellence Brain Sci & Intelligence Techn, Inst Automat, Beijing, Peoples R China
[3] Chinese Acad Sci, Sch Comp & Control Engn, Grad Univ, Beijing, Peoples R China
[4] Jiangsu Normal Univ, Inst Linguist Sci, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; National Social Science Fund of China;
Keywords
Audio-visual database; Natural emotion; Corpus annotation; LSTM; Multimodal emotion recognition; RECOGNITION; SPEECH; EXPRESSION; FEATURES; MODEL;
DOI
10.1007/s12652-016-0406-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a recently collected natural, multimodal, richly annotated emotion database, the CASIA Chinese Natural Emotional Audio-Visual Database (CHEAVD), which aims to provide a basic resource for research on multimodal multimedia interaction. The corpus contains 140 min of emotional segments extracted from films, TV plays and talk shows. Its 238 speakers, ranging in age from children to the elderly, provide broad coverage of speaker diversity, which makes this database a valuable addition to existing emotional databases. In total, 26 non-prototypical emotional states, including the basic six, are labeled by four native speakers. In contrast to other existing emotional databases, we provide multi-emotion labels and fake/suppressed emotion labels. To the best of our knowledge, this database is the first large-scale Chinese natural emotion corpus dealing with multimodal and natural emotion, and it is free for research use. Automatic emotion recognition with Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) is performed on this corpus. Experiments show that an average accuracy of 56% could be achieved on six major emotion states.
Pages: 913-924
Number of pages: 12
Related papers
50 in total
  • [1] CHEAVD: a Chinese natural emotional audio–visual database
    Ya Li
    Jianhua Tao
    Linlin Chao
    Wei Bao
    Yazhu Liu
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2017, 8 : 913 - 924
  • [2] BUILDING A CHINESE NATURAL EMOTIONAL AUDIO-VISUAL DATABASE
    Bao, Wei
    Li, Ya
    Gu, Mingliang
    Yang, Minghao
    Li, Hao
    Chao, Linlin
    Tao, Jianhua
    [J]. 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 583 - 587
  • [3] A Turkish Audio-Visual Emotional Database
    Onder, Onur
    Zhalehpour, Sara
    Erdem, Cigdem Eroglu
    [J]. 2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
  • [4] METHODS AND CHALLENGES FOR CREATING AN EMOTIONAL AUDIO-VISUAL DATABASE
    Pandharipande, Meghna A.
    Chakraborty, Rupayan
    Kopparapu, Sunil Kumar
    [J]. 2017 20TH CONFERENCE OF THE ORIENTAL CHAPTER OF THE INTERNATIONAL COORDINATING COMMITTEE ON SPEECH DATABASES AND SPEECH I/O SYSTEMS AND ASSESSMENT (O-COCOSDA), 2017, : 183 - 188
  • [5] Searching Audio-Visual Clips for Dual-mode Chinese Emotional Speech Database
    Zhang, Xudong
    Wu, Guoqing
    Ren, Fuji
    [J]. 2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA), 2018,
  • [6] The Dysarthric Expressed Emotional Database (DEED): An audio-visual database in British English
    Alhinti, Lubna
    Cunningham, Stuart
    Christensen, Heidi
    [J]. PLOS ONE, 2023, 18 (08):
  • [7] An audio-visual speech recognition with a new mandarin audio-visual database
    Liao, Wen-Yuan
    Pao, Tsang-Long
    Chen, Yu-Te
    Chang, Tsun-Wei
    [J]. INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS/INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL 1, 2007, : 19 - +
  • [8] Audio-Visual Twins Database
    Li, Jing
    Zhang, Li
    Guo, Dong
    Zhuo, Shaojie
    Sim, Terence
    [J]. 2015 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), 2015, : 493 - 500
  • [9] THE VERA AM MITTAG GERMAN AUDIO-VISUAL EMOTIONAL SPEECH DATABASE
    Grimm, Michael
    Kroschel, Kristian
    Narayanan, Shrikanth
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 865 - +
  • [10] SUTAV: A Turkish Audio-Visual Database
    Topkaya, Ibrahim Saygin
    Erdogan, Hakan
    [J]. LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 2334 - 2337