An Entailment-based Scoring Method for Content Selection in Document Summarization

Cited by: 2

Authors:

Dang Hoang Long [1]

Minh-Tien Nguyen [2]

Ngo Xuan Bach [1]

Le-Minh Nguyen [3]

Tu Minh Phuong [1]

Affiliations:

[1] Posts and Telecommunications Institute of Technology, Hanoi, Vietnam

[2] Hung Yen University of Technology and Education, Hung Yen, Vietnam

[3] Japan Advanced Institute of Science and Technology, 1-8 Asahidai, Nomi, Ishikawa, Japan

Source:

PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY (SOICT 2018) | 2018

Keywords:

Web Document Summarization; Entailment; Sentence Scoring; Integer Linear Programming (ILP);

DOI:

10.1145/3287921.3287976

Chinese Library Classification number:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

This paper introduces a scoring method to improve the quality of content selection in an extractive summarization system. Unlike previous models, which mainly use local information inside sentences such as sentence position or sentence length, our method judges the importance of a sentence based on both its own information and its relations to other sentences. For the relation between sentences, we utilize textual entailment, a relationship indicating that the meaning of one sentence can be inferred from another. Unlike previous work on using textual entailment for summarization, we go a step further by looking at aligned words in an entailment sentence pair. Assuming that important words in a salient sentence can be aligned with several words in other sentences, word alignment scores are exploited to compute the entailment score of a sentence. To take advantage of both local and neighbor information when estimating sentence salience, we combine entailment scores with sentence position scores. We validate the proposed scoring method with greedy or integer linear programming approaches for extracting summaries. Experiments on three datasets (including DUC 2001 and 2002) in two different domains show that our model obtains ROUGE scores competitive with state-of-the-art methods for single-document summarization.
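The abstract describes combining a per-sentence entailment score with a position score and then extracting sentences greedily. The following is a minimal sketch of that general idea only, not the authors' implementation: this record does not give the paper's formulas, so the entailment scores are assumed to be precomputed inputs, and the linear position decay, the mixing weight `alpha`, the word-count budget, and all function names are hypothetical choices made for illustration.

```python
from typing import List


def position_score(index: int, n: int) -> float:
    """Assumed linear decay: the first sentence scores 1.0, later ones less."""
    return 1.0 - index / n


def combined_scores(entail: List[float], alpha: float = 0.5) -> List[float]:
    """Mix precomputed entailment scores with position scores.

    `alpha` is a hypothetical interpolation weight, not a value from the paper.
    """
    n = len(entail)
    return [
        alpha * e + (1.0 - alpha) * position_score(i, n)
        for i, e in enumerate(entail)
    ]


def greedy_select(sentences: List[str], scores: List[float], max_words: int) -> List[str]:
    """Pick the highest-scoring sentences that still fit in the word budget."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        length = len(sentences[i].split())
        if used + length <= max_words:
            chosen.append(i)
            used += length
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

A call such as `greedy_select(sentences, combined_scores(entail), max_words=100)` would return the selected sentences in document order. The integer linear programming variant mentioned in the abstract would replace the greedy loop with an optimization over the same scores under the same length constraint.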

Pages: 122-129

Number of pages: 8

