A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction

Cited by: 29

Authors:

Xie, Jiayi [1]

Zhu, Yaochen [1]

Zhang, Zhibin [1]

Peng, Jian [1]

Yi, Jing [1]

Hu, Yaosi [1]

Liu, Hongyi [1]

Chen, Zhenzhong [1]

Affiliations:

[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan, Peoples R China

Source:

WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020) | 2020

Funding:

National Key Research and Development Program of China;

Keywords:

Micro-video popularity prediction; Variational inference; Deep information bottleneck; Multimodal learning; Deep neural networks;

DOI:

10.1145/3366423.3380004

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Predicting the popularity of a micro-video is a challenging task, due to a number of factors impacting the distribution such as the diversity of the video content and user interests, complex online interactions, etc. In this paper, we propose a multimodal variational encoder-decoder (MMVED) framework that considers the uncertain factors as the randomness for the mapping from the multimodal features to the popularity. Specifically, the MMVED first encodes features from multiple modalities in the observation space into latent representations and learns their probability distributions based on variational inference, where only relevant features in the input modalities can be extracted into the latent representations. Then, the modality-specific hidden representations are fused through Bayesian reasoning such that the complementary information from all modalities is well utilized. Finally, a temporal decoder implemented as a recurrent neural network is designed to predict the popularity sequence of a certain micro-video. Experiments conducted on a real-world dataset demonstrate the effectiveness of our proposed model in the micro-video popularity prediction task.
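The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes each modality's latent posterior is a diagonal Gaussian and that "Bayesian reasoning" is realized as a precision-weighted product of Gaussians; the abstract does not state these details, so the fusion rule, the function names `fuse_gaussians` and `sample_latent`, and the toy numbers are all assumptions for illustration, not the authors' implementation.

```python
import math
import random


def fuse_gaussians(mus, logvars):
    """Fuse per-modality diagonal Gaussians by multiplying their densities.

    Assumption: a product of Gaussians stands in for the paper's
    "Bayesian reasoning" fusion. The product of Gaussians is Gaussian,
    with precision equal to the sum of the individual precisions and
    mean equal to the precision-weighted average of the means.
    """
    dims = len(mus[0])
    fused_mu, fused_logvar = [], []
    for d in range(dims):
        precisions = [math.exp(-lv[d]) for lv in logvars]
        total = sum(precisions)
        mean = sum(p * mu[d] for p, mu in zip(precisions, mus)) / total
        fused_mu.append(mean)
        fused_logvar.append(-math.log(total))  # log(1 / total precision)
    return fused_mu, fused_logvar


def sample_latent(mu, logvar, rng):
    """Reparameterized sample: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


# Two hypothetical modalities (e.g. visual and textual), 2-D latent space.
mus = [[0.0, 2.0], [2.0, 2.0]]
logvars = [[0.0, 0.0], [0.0, 0.0]]  # unit variance for both modalities

fused_mu, fused_logvar = fuse_gaussians(mus, logvars)
z = sample_latent(fused_mu, fused_logvar, random.Random(0))
```

With equal unit variances the fused mean is the plain average of the modality means, and the fused variance halves, which reflects the intuition that agreeing modalities reduce uncertainty. In a full model, a sample such as `z` would be passed to the recurrent decoder that outputs the popularity sequence.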


Pages: 2542-2548

Page count: 7

50 items in total

[1] Micro-Video Popularity Prediction Via Multimodal Variational Information Bottleneck
Xie, Jiayi
Zhu, Yaochen
Chen, Zhenzhong
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 24 - 37
[2] Variational Memory Encoder-Decoder
Hung Le
Truyen Tran
Thin Nguyen
Venkatesh, Svetha
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[3] Micro-Video Popularity Prediction with Bidirectional Deep Encoding Network
Jing Peiguang
Ye Xuqing
Liu Yu
Su Yuting
LASER & OPTOELECTRONICS PROGRESS, 2022, 59 (08)
[4] Pedestrian trajectory prediction using BiRNN encoder-decoder framework
Wu, Jiaxu
Woo, Hanwool
Tamura, Yusuke
Moro, Alessandro
Massaroli, Stefano
Yamashita, Atsushi
Asama, Hajime
ADVANCED ROBOTICS, 2019, 33 (18) : 956 - 969
[5] Micro-climate Prediction - Multi Scale Encoder-decoder based Deep Learning Framework
Kumar, Peeyush
Chandra, Ranveer
Bansal, Chetan
Kalyanaraman, Shivkumar
Ganu, Tanuja
Grant, Michael
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3128 - 3138
[6] Timber Tracing with Multimodal Encoder-Decoder Networks
Zolotarev, Fedor
Eerola, Tuomas
Lensu, Lasse
Kalviainen, Heikki
Haario, Heikki
Heikkinen, Jere
Kauppi, Tomi
COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT II, 2019, 11679 : 342 - 353
[7] A joint encoder-decoder error control framework for stereoscopic video coding
Xiang, Xinguang
Zhao, Debin
Wang, Qiang
Ma, Siwei
Gao, Wen
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2010, 21 (08) : 975 - 985
[8] Multimodal cooperative learning for micro-video advertising click prediction
Chen, Runyu
INTERNET RESEARCH, 2022, 32 (02) : 477 - 495
[9] Encoder-Decoder Joint Enhancement for Video Chat
Zhang, Zhenghao
Wang, Zhao
Ye, Yan
Wang, Shiqi
Zheng, Changwen
2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021,
[10] Skip-attention encoder-decoder framework for human motion prediction
Zhang, Ruipeng
Shu, Xiangbo
Yan, Rui
Zhang, Jiachao
Song, Yan
MULTIMEDIA SYSTEMS, 2022, 28 (02) : 413 - 422
