Leveraging Document-Level and Query-Level Passage Cumulative Gain for Document Ranking

Cited by: 0
Authors
Zhi-Jing Wu
Yi-Qun Liu
Jia-Xin Mao
Min Zhang
Shao-Ping Ma
Affiliations
[1] Tsinghua University, Department of Computer Science and Technology
[2] Tsinghua University, Beijing National Research Center for Information Science and Technology
[3] Renmin University of China, Gaoling School of Artificial Intelligence
Keywords
document ranking; neural network; passage cumulative gain
DOI: not available
Abstract
Document ranking is one of the most studied yet challenging problems in information retrieval (IR). More and more studies have begun to address this problem from the perspective of fine-grained document modeling. However, most of them focus on context-independent passage-level relevance signals and ignore the context information. In this paper, we investigate how information gain accumulates across passages and propose the context-aware Passage Cumulative Gain (PCG). The fine-grained PCG avoids the need to split documents into independent passages. We investigate PCG patterns at the document level (DPCG) and the query level (QPCG). Based on these patterns, we propose a BERT-based sequential model called the Passage-level Cumulative Gain Model (PCGM) and show that PCGM can effectively predict PCG sequences. Finally, we apply PCGM to the document ranking task in two ways. The first leverages DPCG sequences to estimate the gain of an individual document; experimental results on two public ad hoc retrieval datasets show that PCGM outperforms most existing ranking models. The second considers cross-document effects and leverages QPCG sequences to estimate marginal relevance; experimental results show that the predicted results are highly consistent with users' preferences. We believe that this work contributes to improving ranking performance and provides more explainability for document ranking.
Pages: 814-838
Number of pages: 24
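
To make the abstract's description of PCGM more concrete, below is a minimal, illustrative sketch (not the authors' released code) of how a BERT-based sequential model for predicting passage cumulative gain could be structured: BERT encodes each (query, passage) pair, an LSTM reads the passage encodings in document order, and a classifier outputs a gain level after each passage. The class name, hyper-parameters, and the use of PyTorch/HuggingFace here are assumptions made for illustration only.

```python
# Hypothetical PCGM-style sketch: per-passage cumulative gain prediction.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class PCGMSketch(nn.Module):
    def __init__(self, num_gain_levels: int = 5, hidden_size: int = 256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        # Sequential layer over passages, capturing context from earlier passages.
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=hidden_size,
            batch_first=True,
        )
        self.classifier = nn.Linear(hidden_size, num_gain_levels)

    def forward(self, input_ids, attention_mask, token_type_ids):
        # input_ids: (num_passages, seq_len) -- one row per (query, passage) pair.
        out = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        passage_repr = out.pooler_output                    # (num_passages, bert_hidden)
        seq_out, _ = self.lstm(passage_repr.unsqueeze(0))   # read passages in document order
        logits = self.classifier(seq_out.squeeze(0))        # (num_passages, num_gain_levels)
        return logits                                       # gain-level logits after each passage


# Usage example: the document-level gain (DPCG) for ranking can be read off the
# prediction made after the last passage of the document.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
query = "passage cumulative gain"
passages = ["first passage text ...", "second passage text ..."]
enc = tokenizer([query] * len(passages), passages,
                padding=True, truncation=True, max_length=128, return_tensors="pt")
model = PCGMSketch()
with torch.no_grad():
    gain_logits = model(enc["input_ids"], enc["attention_mask"], enc["token_type_ids"])
doc_gain = gain_logits[-1].softmax(-1)  # predicted gain distribution for the whole document
```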