Utilizing passage-level relevance and kernel pooling for enhancing BERT-based document reranking

Cited by: 0
Authors
Pan, Min [1 ]
Zhou, Shuting [1 ]
Li, Teng [1 ]
Liu, Yu [1 ]
Pei, Quanli [2 ]
Huang, Angela J. [3 ]
Huang, Jimmy X. [2 ,4 ]
Affiliations
[1] Hubei Normal Univ, Coll Comp & Informat Engn, Huangshi, Hubei, Peoples R China
[2] York Univ, Sch Informat Technol, Toronto, ON, Canada
[3] York Univ, Lassonde Sch Engn, Toronto, ON, Canada
[4] York Univ, Informat Retrieval & Knowledge Management Res Lab, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Key Project of the Hubei Provincial Department of Education; National Natural Science Foundation of China
Keywords
document re-ranking; information retrieval; neural language models; passage-level relevance
DOI
10.1111/coin.12656
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The pre-trained language model (PLM) based on the Transformer encoder, namely BERT, has achieved state-of-the-art results in the field of Information Retrieval. Existing BERT-based ranking models divide documents into passages and aggregate passage-level relevance to rank the document list. However, these common score aggregation strategies cannot capture important semantic information such as document structure, and they have not been studied extensively. In this article, we propose a novel kernel-based score pooling system that captures document-level relevance by aggregating passage-level relevance. In particular, we propose and study several representative kernel pooling functions and several document ranking strategies based on passage-level relevance. Our proposed framework, KnBERT, naturally incorporates kernel functions at the passage level into the BERT-based re-ranking method, which provides a promising avenue for building universal retrieve-then-rerank information retrieval systems. Experiments conducted on the two widely used TREC Robust04 and GOV2 test collections show that KnBERT yields significant improvements over other BERT-based ranking approaches in terms of MAP, P@20, and NDCG@20, with no extra, or even less, computation.
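To make the aggregation idea in the abstract concrete, the sketch below (PyTorch, not the authors' released code) assumes each passage of a document has already been scored by a BERT re-ranker and applies RBF-kernel pooling over those passage scores to produce one document-level score. The kernel means, kernel width, and the final linear layer are illustrative assumptions, not the paper's exact configuration.

import torch

def kernel_pool(passage_scores, mus=(0.1, 0.3, 0.5, 0.7, 0.9), sigma=0.1):
    """Map a vector of passage-level scores [P] to K kernel features."""
    s = passage_scores.unsqueeze(-1)                 # [P, 1]
    mu = torch.tensor(mus, dtype=s.dtype)            # [K] kernel centers
    rbf = torch.exp(-0.5 * ((s - mu) / sigma) ** 2)  # [P, K] soft "match counts"
    # Sum over passages, then log1p, in the spirit of K-NRM-style pooling.
    return torch.log1p(rbf.sum(dim=0))               # [K]

class KernelScoreAggregator(torch.nn.Module):
    """Linear combination of kernel features -> document-level relevance score."""
    def __init__(self, num_kernels=5):
        super().__init__()
        self.w = torch.nn.Linear(num_kernels, 1)

    def forward(self, passage_scores):
        return self.w(kernel_pool(passage_scores)).squeeze(-1)

# Usage: passage_scores would come from a BERT re-ranker applied to each passage.
passage_scores = torch.tensor([0.92, 0.40, 0.15, 0.78])
aggregator = KernelScoreAggregator(num_kernels=5)
doc_score = aggregator(passage_scores)

Compared with max- or first-passage aggregation, the kernel features record how many passages fall into each relevance band, which is one way such pooling can retain more of the document's score distribution.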
Pages: 28