共 50 条
- [1] LayoutLM: Pre-training of Text and Layout for Document Image Understanding [J]. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1192 - 1200
- [2] LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding [J]. 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 2579 - 2591
- [3] WUKONG- READER: Multi-modal Pre-training for Fine-grained Visual Document Understanding [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 13386 - 13401
- [4] Multi-Modal Contrastive Pre-training for Recommendation [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 99 - 108
- [6] Graph-Text Multi-Modal Pre-training for Medical Representation Learning [J]. CONFERENCE ON HEALTH, INFERENCE, AND LEARNING, VOL 174, 2022, 174 : 261 - 281
- [7] MMPT'21: International JointWorkshop on Multi-Modal Pre-Training for Multimedia Understanding [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 694 - 695
- [8] MULTI-MODAL PRE-TRAINING FOR AUTOMATED SPEECH RECOGNITION [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 246 - 250
- [9] MGeo: Multi-Modal Geographic Language Model Pre-Training [J]. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 185 - 194
- [10] TableVLM: Multi-modal Pre-training for Table Structure Recognition [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 2437 - 2449