Automatic Extractive Summarization using GAN Boosted by DistilBERT Word Embedding and Transductive Learning

Cited by: 0
Authors
Li, Dongliang [1 ]
Li, Youyou [1 ]
Zhang, Zhigang [1 ]
Affiliations
[1] Jiaozuo Univ, Coll Artificial Intelligence, Jiaozuo 454000, Peoples R China
Keywords
Extractive text summarization; generative adversarial network; transductive learning; long short-term memory; DistilBERT
DOI
10.14569/IJACSA.2023.0141107
CLC Number
TP301 [Theory and Methods]
Subject Classification Code
081202
Abstract
Text summarization is crucial in diverse fields such as engineering and healthcare, greatly enhancing time and cost efficiency. This study introduces an innovative extractive text summarization approach that combines a Generative Adversarial Network (GAN), Transductive Long Short-Term Memory (TLSTM), and DistilBERT word embeddings. DistilBERT, a streamlined BERT variant, is roughly 40% smaller and about 60% faster than BERT while retaining 97% of its language comprehension capability; these gains are achieved through knowledge distillation during pre-training. Our methodology uses a GAN consisting of generator and discriminator networks, both built primarily on TLSTM, which excels at capturing temporal dependencies in time-series prediction. For more effective model fitting, transductive learning is employed, assigning higher weights to samples nearer to the test point. The generator estimates the probability that each sentence should be included in the summary, while the discriminator critically examines the generated summary. This reciprocal relationship creates a dynamic iterative process that yields top-tier summaries. To train the discriminator efficiently, a dedicated loss function is proposed, incorporating the generator's output, the actual document summaries, and artificially created summaries. This strategy motivates the generator to experiment with diverse sentence combinations and to produce summaries that meet high standards of quality and coherence. The model's effectiveness was evaluated on the widely used CNN/Daily Mail dataset, a standard benchmark for summarization tasks. According to the ROUGE metric, our experiments demonstrate that the model outperforms existing models in summarization quality and efficiency.
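To make the described pipeline concrete, the sketch below wires together the components named in the abstract: DistilBERT sentence embeddings, an LSTM-based generator that scores each sentence's probability of inclusion, an LSTM-based discriminator that judges a candidate summary, and a simple distance-based transductive weighting that favours samples near a test point. This is a minimal Python/PyTorch illustration using the Hugging Face distilbert-base-uncased checkpoint; the class names, hidden sizes, the softmax-over-distance weighting, and the omitted adversarial loss are illustrative assumptions, not the paper's exact TLSTM cells or training objective.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# DistilBERT encoder (roughly 40% smaller and ~60% faster than BERT-base).
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")

def embed_sentences(sentences):
    """Mean-pooled DistilBERT embedding, one 768-d vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (n, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)            # (n, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)             # (n, 768)

class Generator(nn.Module):
    """Bi-LSTM over a document's sentence embeddings -> inclusion probabilities."""
    def __init__(self, dim=768, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, sents):                                # (1, n, 768)
        out, _ = self.lstm(sents)
        return torch.sigmoid(self.score(out)).squeeze(-1)    # (1, n)

class Discriminator(nn.Module):
    """LSTM that judges whether a candidate summary resembles a reference one."""
    def __init__(self, dim=768, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.judge = nn.Linear(hidden, 1)

    def forward(self, summary):                              # (1, m, 768)
        _, (h, _) = self.lstm(summary)
        return torch.sigmoid(self.judge(h[-1]))              # (1, 1)

def transductive_weights(train_emb, test_emb, tau=1.0):
    """Illustrative transductive weighting: samples closer to the test point get larger weights."""
    dist = torch.cdist(train_emb, test_emb.unsqueeze(0)).squeeze(-1)   # (n,)
    return torch.softmax(-dist / tau, dim=0)

# Toy usage on a three-sentence document.
doc = ["The plant opened in 2019.",
       "It employs about 2,000 people.",
       "Output doubled last year."]
emb = embed_sentences(doc)                                   # (3, 768)
inclusion_probs = Generator()(emb.unsqueeze(0))              # generator scores per sentence
weights = transductive_weights(emb, emb.mean(0))             # higher weight near the "test point"
realism = Discriminator()(emb.unsqueeze(0))                  # discriminator verdict on the summary

In a full implementation the two networks would be trained adversarially, with the transductive weights and the reference and artificially created summaries folded into the proposed discriminator loss.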
Pages: 61-74
Page count: 14