CoMSum and SIBERT: A Dataset and Neural Model for Query-Based Multi-document Summarization

Cited by: 6
Authors
Kulkarni, Sayali [1]
Chammas, Sheide [1]
Zhu, Wan [1]
Sha, Fei [1]
Ie, Eugene [1]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
Keywords
Extractive summarization; Abstractive summarization; Neural models; Transformers; Summarization dataset;
DOI
10.1007/978-3-030-86331-9_6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Document summarization compresses one or more source documents into succinct, information-preserving text. A variant of this task is query-based multi-document summarization (qMDS), which targets summaries at specific informational needs, contextualized to the query. However, progress on qMDS is hindered by the limited availability of large-scale datasets. In this work, we make two contributions. First, we propose an approach for automatically generating a dataset for both extractive and abstractive summarization, and we release a version of it publicly. Second, we design a neural model, SIBERT, for extractive summarization that exploits the hierarchical nature of the input. It also infuses queries to extract query-specific summaries. We evaluate this model on the CoMSum dataset and show a significant improvement in performance. This should provide a baseline and enable the use of CoMSum in future research on qMDS.
Pages: 84-98
Number of pages: 15
Related papers
50 in total
  • [41] Multi-document summarization based on the Yago ontology
    Baralis, Elena
    Cagliero, Luca
    Jabeen, Saima
    Fiori, Alessandro
    Shah, Sajid
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (17) : 6976 - 6984
  • [42] ESUM: An Efficient System for Query-Specific Multi-document Summarization
    Chowdary, C. Ravindranath
    Kumar, P. Sreenivasa
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2009, 5478 : 724 - 728
  • [43] Rhetorics-based multi-document summarization
    Atkinson, John
    Munoz, Ricardo
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (11) : 4346 - 4352
  • [44] Multi-document Summarization Based on Sentence Clustering
    Zheng, Hai-Tao
    Gong, Shu-Qin
    Chen, Hao
    Jiang, Yong
    Xia, Shu-Tao
    NEURAL INFORMATION PROCESSING (ICONIP 2014), PT II, 2014, 8835 : 429 - 436
  • [45] Multi-document summarization based on concept space
    Tang, STK
    Yen, J
    Yang, CC
    ITRE2003: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: RESEARCH AND EDUCATION, 2003, : 385 - 389
  • [46] Multi-document summarization based on cohesion with disambiguation
    Chen, Yanmin
    Lou, Xizhong
    ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 2, PROCEEDINGS, 2007, : 232 - +
  • [47] MULTI-DOCUMENT VIDEO SUMMARIZATION
    Wang, Feng
    Merialdo, Bernard
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 1326 - 1329
  • [48] Query-oriented Unsupervised Multi-document Summarization on Big Data
    Sunaina
    Kamath, Sowmya S.
    7TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT 2016), 2016,
  • [49] A document-sensitive graph model for multi-document summarization
    Furu Wei
    Wenjie Li
    Qin Lu
    Yanxiang He
    Knowledge and Information Systems, 2010, 22 : 245 - 259
  • [50] MS2: A Dataset for Multi-Document Summarization of Medical Studies
    DeYoung, Jay
    Beltagy, Iz
    van Zuylen, Madeleine
    Kuehl, Bailey
    Wang, Lucy Lu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 7494 - 7513