Leveraging Document-Level and Query-Level Passage Cumulative Gain for Document Ranking

Cited by: 0
Authors
Zhi-Jing Wu
Yi-Qun Liu
Jia-Xin Mao
Min Zhang
Shao-Ping Ma
Affiliations
[1] Tsinghua University, Department of Computer Science and Technology
[2] Tsinghua University, Beijing National Research Center for Information Science and Technology
[3] Renmin University of China, Gaoling School of Artificial Intelligence
Keywords
document ranking; neural network; passage cumulative gain
DOI
Not available
Abstract
Document ranking is one of the most studied yet most challenging problems in information retrieval (IR). A growing number of studies address it through fine-grained document modeling. However, most of them focus on context-independent passage-level relevance signals and ignore context information. In this paper, we investigate how information gain accumulates across passages and propose the context-aware Passage Cumulative Gain (PCG). The fine-grained PCG avoids the need to split documents into independent passages. We investigate PCG patterns at the document level (DPCG) and the query level (QPCG). Based on these patterns, we propose a BERT-based sequential model called the Passage-level Cumulative Gain Model (PCGM) and show that PCGM can effectively predict PCG sequences. Finally, we apply PCGM to the document ranking task in two ways. The first leverages DPCG sequences to estimate the gain of an individual document; experimental results on two public ad hoc retrieval datasets show that PCGM outperforms most existing ranking models. The second considers cross-document effects and leverages QPCG sequences to estimate marginal relevance; experimental results show that the predictions are highly consistent with users’ preferences. We believe that this work contributes to improving ranking performance and to making document ranking more explainable.
Pages: 814-838
Page count: 24