Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection

Cited by: 12
Author
Chen, Jingyuan [1]
Affiliation
[1] Natl Univ Singapore, Sch Comp, Singapore, Singapore
Keywords
Micro-Videos; Popularity Prediction; Venue Estimation; Multi-Modal Learning
DOI
10.1145/2964284.2971477
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Micro-video sharing services, a new phenomenon in social media, enable users to share micro-videos and have attracted growing enthusiasm. One distinct characteristic of micro-videos is their multi-modality: these videos have visual signals, audio tracks, textual descriptions, and social clues. Such multi-modal data makes it possible to obtain a comprehensive understanding of videos and hence provides new opportunities for researchers. However, limited effort has so far been dedicated to this emerging form of user-generated content (UGC), owing to the lack of a large-scale benchmark dataset. To this end, we construct a large-scale micro-video dataset that can support many research domains, such as popularity prediction and venue estimation. Based on this dataset, we conduct an initial study of micro-video popularity prediction. Finally, we identify our future work.
Pages: 1454-1458
Page count: 5