WenLan: Efficient Large-Scale Multi-Modal Pre-Training on Real World Data

被引:0
|
作者
Song, Ruihua [1 ]
机构
[1] Renmin Univ, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
关键词
Multi-modal; pre-training models; image and text pairs; weak correlation assumption;
D O I
10.1145/3463945.3468120
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
引用
收藏
页码:3 / 3
页数:1
相关论文
共 50 条
  • [1] Efficient Large-Scale Multi-Modal Classification
    Kiela, Douwe
    Grave, Edouard
    Joulin, Armand
    Mikolov, Tomas
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5198 - 5204
  • [2] Multi-Modal Contrastive Pre-training for Recommendation
    Liu, Zhuang
    Ma, Yunpu
    Schubert, Matthias
    Ouyang, Yuanxin
    Xiong, Zhang
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 99 - 108
  • [3] MULTI-MODAL PRE-TRAINING FOR AUTOMATED SPEECH RECOGNITION
    Chan, David M.
    Ghosh, Shalini
    Chakrabarty, Debmalya
    Hoffmeister, Bjorn
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 246 - 250
  • [4] MGeo: Multi-Modal Geographic Language Model Pre-Training
    Ding, Ruixue
    Chen, Boli
    Xie, Pengjun
    Huang, Fei
    Li, Xin
    Zhang, Qiang
    Xu, Yao
    [J]. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 185 - 194
  • [5] TableVLM: Multi-modal Pre-training for Table Structure Recognition
    Chen, Leiyuan
    Huang, Chengsong
    Zheng, Xiaoqing
    Lin, Jinshu
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 2437 - 2449
  • [6] Pre-training on Large-Scale Heterogeneous Graph
    Jiang, Xunqiang
    Jia, Tianrui
    Fang, Yuan
    Shi, Chuan
    Lin, Zhe
    Wang, Hui
    [J]. KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 756 - 766
  • [7] Real-time Emotion Pre-Recognition in Conversations with Contrastive Multi-modal Dialogue Pre-training
    Ju, Xincheng
    Zhang, Dong
    Zhu, Suyang
    Li, Junhui
    Li, Shoushan
    Zhou, Guodong
    [J]. PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 1045 - 1055
  • [8] Multi-modal Masked Pre-training for Monocular Panoramic Depth Completion
    Yan, Zhiqiang
    Li, Xiang
    Wang, Kun
    Zhang, Zhenyu
    Li, Jun
    Yang, Jian
    [J]. COMPUTER VISION - ECCV 2022, PT I, 2022, 13661 : 378 - 395
  • [9] Versatile Multi-Modal Pre-Training for Human-Centric Perception
    Hong, Fangzhou
    Pan, Liang
    Cai, Zhongang
    Liu, Ziwei
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 16135 - 16145
  • [10] Effective Classification for Multi-modal Behavioral Authentication on Large-Scale Data
    Yamaguchi, Shuji
    Gomi, Hidehito
    Kobayashi, Ryosuke
    Tran Phuong Thao
    Irvan, Mhd
    Yamaguchi, Rie Shigetomi
    [J]. 2020 15TH ASIA JOINT CONFERENCE ON INFORMATION SECURITY (ASIAJCIS 2020), 2020, : 101 - 109