WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations

Cited by: 0
Authors
Deng, Haolin [1]
Wang, Chang [3]
Li, Xin [3]
Yuan, Dezhang [3]
Zhan, Junlang [3]
Zhou, Tianhua [3]
Ma, Jin [4]
Gao, Jun [1]
Xu, Ruifeng [1, 2, 5]
Affiliations
[1] Harbin Inst Technol, Shenzhen, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] Tencent Inc, Shenzhen, Peoples R China
[4] Univ Sci & Technol China, Hefei, Peoples R China
[5] Guangdong Prov Key Lab Novel Secur Intelligence T, Shenzhen, Peoples R China
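Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract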
Enhancing the attribution in large language models (LLMs) is a crucial task. One feasible approach is to enable LLMs to cite external sources that support their generations. However, existing datasets and evaluation methods in this domain still exhibit notable limitations. In this work, we formulate the task of attributed query-focused summarization (AQFS) and present WebCiteS, a Chinese dataset featuring 7k human-annotated summaries with citations. WebCiteS derives from real-world user queries and web search results, offering a valuable resource for model training and evaluation. Prior works in attribution evaluation do not differentiate between groundedness errors and citation errors. They also fall short in automatically verifying sentences that draw partial support from multiple sources. We tackle these issues by developing detailed metrics and enabling the automatic evaluator to decompose the sentences into sub-claims for fine-grained verification. Our comprehensive evaluation of both open-source and proprietary models on WebCiteS highlights the challenge LLMs face in correctly citing sources, underscoring the necessity for further improvement.
Pages: 15095-15114
Page count: 20