I3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval

Cited by: 4
Authors
Dong, Qian [1 ,2 ,3 ]
Liu, Yiding [4 ]
Ai, Qingyao [1 ,2 ,3 ]
Li, Haitao [1 ,2 ,3 ]
Wang, Shuaiqiang [4 ]
Liu, Yiqun [1 ,2 ,3 ]
Yin, Dawei [4 ]
Ma, Shaoping [1 ,2 ,3 ]
Affiliations
[1] Tsinghua Univ, DCST, Beijing, Peoples R China
[2] Quan Cheng Lab, Beijing, Peoples R China
[3] Zhongguancun Lab, Beijing, Peoples R China
[4] Baidu Inc, Beijing, Peoples R China
Keywords
Learning to Rank; Language Models; Semantic Matching
DOI
10.1145/3583780.3614923
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Passage retrieval is a fundamental task in many information systems, such as web search and question answering, where both efficiency and effectiveness are critical concerns. In recent years, neural retrievers based on pre-trained language models (PLMs), such as dual-encoders, have achieved huge success. Yet, studies have found that the performance of dual-encoders is often limited because the interaction between queries and candidate passages is neglected. Therefore, various interaction paradigms have been proposed to improve the performance of vanilla dual-encoders. In particular, recent state-of-the-art methods often introduce late interaction during model inference. However, such late-interaction-based methods usually incur substantial computation and storage costs on large corpora. Despite their effectiveness, these efficiency and space-footprint concerns remain important factors limiting the application of interaction-based neural retrieval models. To tackle this issue, we Incorporate Implicit Interaction into dual-encoders and propose the I3 retriever. In particular, our implicit interaction paradigm leverages generated pseudo-queries to simulate query-passage interaction, and it is jointly optimized with the query and passage encoders in an end-to-end manner. The implicit interaction can be fully pre-computed and cached, and its inference process involves only a simple dot product between the query vector and the passage vector, which makes it as efficient as a vanilla dual-encoder. We conduct comprehensive experiments on the MS MARCO and TREC 2019 Deep Learning datasets, demonstrating the I3 retriever's superiority in terms of both effectiveness and efficiency. Moreover, the proposed implicit interaction is compatible with special pre-training and knowledge distillation for passage retrieval, which yields new state-of-the-art performance. The code is available at https://github.com/Deriq-Qian-Dong/III-Retriever.
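For illustration only, the following is a minimal sketch of the retrieval pattern the abstract describes: passage representations (which, in I3, would already incorporate the implicit interaction with generated pseudo-queries) are pre-computed offline and cached, so online retrieval reduces to a dot product between the query vector and the cached passage vectors. The encoder functions below are hypothetical placeholders under that assumption, not the paper's actual models.

import hashlib
import numpy as np

def _placeholder_embedding(text: str, dim: int = 768) -> np.ndarray:
    """Deterministic random unit vector standing in for a real PLM embedding."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def encode_passage_with_pseudo_query(passage: str) -> np.ndarray:
    """Hypothetical offline passage encoder. In I3, a generated pseudo-query would
    interact with the passage inside the passage encoder; the resulting vector is
    computed once and cached."""
    return _placeholder_embedding("pseudo-query interaction :: " + passage)

def encode_query(query: str) -> np.ndarray:
    """Hypothetical online query encoder (placeholder)."""
    return _placeholder_embedding(query)

# Offline: pre-compute and cache passage vectors for the whole corpus.
corpus = ["passage one ...", "passage two ...", "passage three ..."]
passage_matrix = np.stack([encode_passage_with_pseudo_query(p) for p in corpus])

# Online: ranking the cached corpus is a single matrix-vector (dot) product,
# matching the inference cost of a vanilla dual-encoder.
query_vec = encode_query("what is passage retrieval?")
scores = passage_matrix @ query_vec
ranking = np.argsort(-scores)
print([(corpus[i], float(scores[i])) for i in ranking])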
Pages: 441 - 451
Number of pages: 11
Related Papers
50 records in total
  • [1] Incorporating Explicit Knowledge in Pre-trained Language Models for Passage Re-ranking
    Dong, Qian
    Liu, Yiding
    Cheng, Suqi
    Wang, Shuaiqiang
    Cheng, Zhicong
    Niu, Shuzi
    Yin, Dawei
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1490 - 1501
  • [2] MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval
    Boualili, Lila
    Moreno, Jose G.
    Boughanem, Mohand
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1977 - 1980
  • [3] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    ENGINEERING, 2023, 25 : 51 - 65
  • [4] Somun: entity-centric summarization incorporating pre-trained language models
    Inan, Emrah
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (10): 5301 - 5311
  • [5] Somun: entity-centric summarization incorporating pre-trained language models
    Emrah Inan
    Neural Computing and Applications, 2021, 33 : 5301 - 5311
  • [6] ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models
    Zhang, Jianyi
    Muhamed, Aashiq
    Anantharaman, Aditya
    Wang, Guoyin
    Chen, Changyou
    Zhong, Kai
    Cui, Qingjun
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    Chen, Yiran
    61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1128 - 1136
  • [7] Annotating Columns with Pre-trained Language Models
    Suhara, Yoshihiko
    Li, Jinfeng
    Li, Yuliang
    Zhang, Dan
    Demiralp, Cagatay
    Chen, Chen
    Tan, Wang-Chiew
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 1493 - 1503
  • [8] LaoPLM: Pre-trained Language Models for Lao
    Lin, Nankai
    Fu, Yingwen
    Yang, Ziyu
    Chen, Chuwei
    Jiang, Shengyi
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6506 - 6512
  • [9] PhoBERT: Pre-trained language models for Vietnamese
    Dat Quoc Nguyen
    Anh Tuan Nguyen
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 1037 - 1042
  • [10] Deciphering Stereotypes in Pre-Trained Language Models
    Ma, Weicheng
    Scheible, Henry
    Wang, Brian
    Veeramachaneni, Goutham
    Chowdhary, Pratim
    Sung, Alan
    Koulogeorge, Andrew
    Wang, Lili
    Yang, Diyi
    Vosoughi, Soroush
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 11328 - 11345