Deep Multimodal Learning for Information Retrieval

Authors
Ji, Wei [1]
Wei, Yinwei [2]
Zheng, Zhedong [1]
Fei, Hao [1]
Chua, Tat-Seng [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Monash Univ, Clayton, Vic, Australia
Keywords
Information retrieval; Multi-modal; CLIP
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Information retrieval (IR) is a fundamental technique that aims to acquire information from a collection of documents, web pages, or other sources. While traditional text-based IR has achieved great success, the under-utilization of varied data sources in different modalities (i.e., text, images, audio, and video) keeps IR techniques from reaching their full potential and thus limits their real-world applications. In recent years, the rapid development of deep multimodal learning has paved the way for advancing IR with multiple modalities. Benefiting from a variety of data types and modalities, recent techniques such as CLIP, ChatGPT, and GPT4 have greatly facilitated multimodal learning and IR. In the context of IR, deep multimodal learning has shown strong potential to improve the performance of retrieval systems by enabling them to better understand and process the diverse types of data they encounter. Despite the great potential shown by multimodal-empowered IR, unsolved challenges and open questions remain in the related directions. With this workshop, we aim to provide a platform for discussion about multimodal IR among scholars, practitioners, and other interested parties.
Pages: 9739-9741
Page count: 3