Neural Abstractive Summarization for Long Text and Multiple Tables

Cited by: 0

Authors:

Liu, Shuaiqi [1]

Cao, Jiannong [1]

Deng, Zhongfen [2]

Zhao, Wenting [2]

Yang, Ruosong [1]

Wen, Zhiyuan [1]

Yu, Philip S. [2]

Affiliations:

[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China

[2] Univ Illinois, Chicago, IL 60607 USA

Source:

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | 2024 / Vol. 36 / No. 06

Keywords:

Document summarization; natural language generation; natural language processing; text summarization

DOI:

10.1109/TKDE.2023.3324012

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Abstractive summarization aims to generate a concise summary covering the input document's salient information. Within a report document, the salient information can be scattered in the textual and non-textual content. However, existing document summarization datasets and methods usually focus on the text and filter out the non-textual content. Missing tabular data can limit produced summaries' informativeness, especially when summaries require covering quantitative descriptions of critical metrics in tables. Existing datasets and methods cannot meet the requirements of summarizing long text and dozens of tables in each report document. To deal with the scarcity of available datasets, we propose FINDSum, the first large-scale dataset for long text and multi-table summarization. Built on 21,125 annual reports from 3,794 companies, FINDSum has two subsets for summarizing each company's results of operations and liquidity. Besides, we present four types of summarization methods to jointly consider text and table content when summarizing reports. Additionally, we propose a set of evaluation metrics to assess the usage of numerical information in produced summaries. Our summarization methods significantly outperform advanced baselines, which verifies the necessity of incorporating textual and tabular data when summarizing report documents. We also conduct extensive comparative experiments to identify vital model components and configurations that can improve summarization results.

Pages: 2572-2586

Page count: 15
