DEEP LEARNING FOR MULTIMODAL-BASED VIDEO INTERESTINGNESS PREDICTION

Cited by: 0
Authors
Shen, Yuesong [1 ]
Demarty, Claire-Helene [2 ]
Duong, Ngoc Q. K. [2 ]
Affiliations
[1] Tech Univ Munich, Munich, Germany
[2] Tech, Rennes, France
Keywords
Video interestingness prediction; social interestingness; content interestingness; multimodal fusion; deep neural network (DNN); MediaEval 2016
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Predicting the interestingness of media content remains an important but challenging research subject. The difficulty comes first from the fact that, besides being a high-level semantic concept, interestingness is highly subjective, and no general definition of it has been agreed upon. This paper presents the use of up-to-date deep learning techniques for solving the task. We perform experiments with both social-driven (i.e., Flickr videos) and content-driven (i.e., videos from the MediaEval 2016 interestingness task) datasets. To account for the temporal aspect and multimodality of videos, we tested various deep neural network (DNN) architectures, including a new combination of several recurrent neural networks (RNNs) that handles several temporal samples at the same time. We then investigated different strategies for dealing with unbalanced datasets. Multimodality, in the form of mid-level fusion of audio and visual information, benefited the task. We also established that social interestingness differs from content interestingness.
Pages: 1003 - 1008
Number of pages: 6
Related papers
50 items in total
  • [1] Deep Multimodal Features for Movie Genre and Interestingness Prediction
    Ben-Ahmed, Olfa
    Huet, Benoit
    [J]. 2018 16TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2018,
  • [2] A Knowledge Augmented and Multimodal-Based Framework for Video Summarization
    Xie, Jiehang
    Chen, Xuanbai
    Lu, Shao-Ping
    Yang, Yulu
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [3] Multimodal-Based and Aesthetic-Guided Narrative Video Summarization
    Xie, Jiehang
    Chen, Xuanbai
    Zhang, Tianyi
    Zhang, Yixuan
    Lu, Shao-Ping
    Cesar, Pablo
    Yang, Yulu
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 4894 - 4908
  • [4] Video Interestingness Prediction Based on Ranking Model
    Wang, Shuai
    Chen, Shizhe
    Zhao, Jinming
    Jin, Qin
    [J]. PROCEEDINGS OF THE JOINT WORKSHOP OF THE 4TH WORKSHOP ON AFFECTIVE SOCIAL MULTIMEDIA COMPUTING AND FIRST MULTI-MODAL AFFECTIVE COMPUTING OF LARGE-SCALE MULTIMEDIA DATA (ASMMC-MMAC'18), 2018, : 55 - 61
  • [5] Multimodal-Based Supervised Learning for Image Search Reranking
    Zhao, Shengnan
    Ma, Jun
    Cui, Chaoran
    [J]. WEB-AGE INFORMATION MANAGEMENT (WAIM 2015), 2015, 9098 : 135 - 147
  • [6] Video semantic concept discovery using multimodal-based association classification
    Lin, Lin
    Ravitz, Guy
    Shyu, Mei-Ling
    Chen, Shu-Ching
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 859 - +
  • [7] Hotel Stock Prediction Based on Multimodal Deep Learning
    Liu, Yang
    Zhang, Wen
    Hu, Yi
    Mao, Jin
    Huang, Fei
    [J]. Data Analysis and Knowledge Discovery, 2023, 7 (05) : 21 - 32
  • [8] Research on the Application of Multimodal-Based Machine Learning Algorithms to Water Quality Classification
    Xin, Lei
    Mou, Tianyu
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [9] Multimodal deep representation learning for video classification
    Haiman Tian
    Yudong Tao
    Samira Pouyanfar
    Shu-Ching Chen
    Mei-Ling Shyu
    [J]. World Wide Web, 2019, 22 : 1325 - 1341
  • [10] Multimodal deep representation learning for video classification
    Tian, Haiman
    Tao, Yudong
    Pouyanfar, Samira
    Chen, Shu-Ching
    Shyu, Mei-Ling
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (03): : 1325 - 1341