Retrieval-Augmented Multiple Instance Learning

Authors
Cui, Yufei [1, 2]
Liu, Ziquan [3]
Chen, Yixin [4]
Lu, Yuchen [1, 5]
Yu, Xinyue [1, 5]
Liu, Xue [1, 2]
Kuo, Tei-Wei [6, 7]
Rodrigues, Miguel R. D. [3]
Xue, Chun Jason [4]
Chan, Antoni B. [4]
Affiliations
[1] Mila, Montreal, PQ, Canada
[2] McGill Univ, Montreal, PQ, Canada
[3] UCL, London, England
[4] City Univ Hong Kong, Hong Kong, Peoples R China
[5] Univ Montreal, Montreal, PQ, Canada
[6] Natl Taiwan Univ, Taipei, Taiwan
[7] MBZUAI, Abu Dhabi, U Arab Emirates
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Multiple Instance Learning (MIL) is a crucial weakly supervised learning method applied across various domains, e.g., medical diagnosis based on whole slide images (WSIs). Recent advancements in MIL algorithms have yielded exceptional performance when the training and test data originate from the same domain, such as WSIs obtained from the same hospital. However, this paper reveals a performance deterioration of MIL models when tested on an out-of-domain test set, exemplified by WSIs sourced from a novel hospital. To address this challenge, this paper introduces the Retrieval-AugMented MIL (RAM-MIL) framework, which integrates Optimal Transport (OT) as the distance metric for nearest neighbor retrieval. The development of RAM-MIL is driven by two key insights. First, a theoretical discovery indicates that reducing the input's intrinsic dimension can minimize the approximation error in attention-based MIL. Second, previous studies highlight a link between input intrinsic dimension and the feature merging process with the retrieved data. Empirical evaluations conducted on WSI classification demonstrate that the proposed RAM-MIL framework achieves state-of-the-art performance in both in-domain scenarios, where the training and retrieval data are in the same domain, and more crucially, in out-of-domain scenarios, where the (unlabeled) retrieval data originates from a different domain. Furthermore, the use of the transportation matrix derived from OT renders the retrieval results interpretable at the instance level, in contrast to the vanilla ℓ2 distance, and allows for visualization for human experts.
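Illustrative sketch (not the authors' implementation): the abstract describes using OT as the distance for nearest-neighbor retrieval between bags of instances, with the transport matrix giving instance-level interpretability. The Python below shows one way this idea could look, under several assumptions that are not stated in the record: uniform instance weights, a squared Euclidean ground cost, and an entropy-regularized Sinkhorn solver. The function names `sinkhorn_plan`, `ot_distance`, and `retrieve_nearest` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT plan between two uniformly weighted bags.

    Assumption: uniform instance weights; `reg` is relative to the
    largest cost entry so the kernel does not underflow.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # source marginal (query instances)
    b = np.full(m, 1.0 / m)          # target marginal (candidate instances)
    scale = reg * (cost.max() + 1e-12)
    K = np.exp(-cost / scale)        # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):         # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def ot_distance(bag_x, bag_y, reg=0.1):
    """Return (transport cost, transport plan) between two bags.

    Each bag is an array of shape (num_instances, feature_dim).
    Assumption: squared Euclidean ground cost between instance features.
    """
    diff = bag_x[:, None, :] - bag_y[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    plan = sinkhorn_plan(cost, reg=reg)
    return float(np.sum(plan * cost)), plan


def retrieve_nearest(query_bag, candidate_bags, reg=0.1):
    """Index, distance, and plan of the candidate bag closest to the query."""
    results = [ot_distance(query_bag, c, reg=reg) for c in candidate_bags]
    dists = [d for d, _ in results]
    best = int(np.argmin(dists))
    return best, dists[best], results[best][1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(0.0, 1.0, size=(6, 8))
    candidates = [
        rng.normal(5.0, 1.0, size=(5, 8)),   # far from the query
        rng.normal(0.0, 1.0, size=(7, 8)),   # close to the query
    ]
    idx, dist, plan = retrieve_nearest(query, candidates)
    print("nearest candidate:", idx, "plan shape:", plan.shape)
```

Entry `plan[i, j]` is the mass moved from query instance `i` to retrieved instance `j`, which is the kind of instance-level correspondence a plain ℓ2 distance between pooled bag features would not expose.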
Pages: 20