Exploiting Evidential Theory in the Fusion of Textual, Audio, and Visual Modalities for Affective Music Video Retrieval

Cited: 0
Authors
Nemati, Shahla [1 ]
Naghsh-Nilchi, Ahmad Reza [2 ]
Affiliations
[1] Shahrekord Univ, Dept Comp Engn, Fac Engn, Shahrekord, Iran
[2] Univ Isfahan, Fac Comp Engn, Dept Artificial Intelligence, Esfahan, Iran
Keywords
Affective music video retrieval; Lexicon-based sentiment analysis; Information fusion; Emotion detection; SENTIMENT ANALYSIS; FRAMEWORK;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Developing techniques to retrieve video content according to its impact on viewers' emotions is the main goal of affective video retrieval systems. Existing systems mainly apply a multimodal approach that fuses information from different modalities to determine the affect category. In this paper, the effect of exploiting two types of textual information to enrich the audio-visual content of music videos is evaluated: subtitles (songs' lyrics) and texts obtained from viewers' comments on video sharing websites. To determine the emotional content of texts, an unsupervised lexicon-based method is applied; this method needs no human-coded corpus for training and is much faster than supervised approaches. To integrate these modalities, a new information fusion method is proposed based on the Dempster-Shafer theory of evidence. Experiments are conducted on the video clips of the DEAP dataset and their associated viewers' comments on YouTube. Results show that incorporating songs' lyrics with the audio-visual content has no positive effect on retrieval performance, whereas exploiting viewers' comments significantly improves the affective retrieval system. This can be explained by the fact that viewers' affective responses depend not only on the video itself but also on its context.
Pages: 222-228 (7 pages)
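The fusion method described in the abstract builds on the Dempster-Shafer theory of evidence. The sketch below is not the paper's actual algorithm; it only illustrates the standard Dempster rule of combination that such a method would rest on, with a made-up toy example in which two modalities (audio-visual features and viewers' comments, as in the paper) each assign belief masses over the valence classes {pos, neg}.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two basic belief assignments with Dempster's rule.

    m1, m2: dicts mapping frozenset focal elements to mass values.
    Returns the normalized combined assignment.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # mass falling on the empty set measures conflict
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}

# Toy example (hypothetical numbers, not from the paper):
POS, NEG = frozenset({"pos"}), frozenset({"neg"})
THETA = POS | NEG  # the full frame, i.e. ignorance

audio_visual = {POS: 0.6, NEG: 0.2, THETA: 0.2}
comments     = {POS: 0.5, NEG: 0.3, THETA: 0.2}

fused = dempster_combine(audio_visual, comments)
# Both sources lean positive, so the fused belief in "pos" exceeds
# either source's individual mass (reinforcement effect).
```

Note how combination sharpens agreement: because both modalities favor "pos", the fused mass on "pos" rises above 0.6, while the residual ignorance mass shrinks. This reinforcement/normalization behavior is what makes evidential fusion attractive for combining modalities of unequal reliability.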