Abstractive text summarization using deep learning with a new Turkish summarization benchmark dataset

Cited by: 3
Authors
Ertam, Fatih [1]
Aydin, Galip [2]
Affiliations
[1] Firat Univ, Technol Fac, Dept Digital Forens Engn, Elazig, Turkey
[2] Firat Univ, Engn Fac, Dept Comp Engn, Elazig, Turkey
Keywords
abstract summarization; deep learning; information retrieval; text summarization; web scraping; framework; models
DOI
10.1002/cpe.6482
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
The exponential increase in the amount of textual data available on the Internet creates new challenges for accessing information accurately and quickly. Text summarization can be defined as reducing the size of the text to be summarized without distorting its meaning. Summarization can be extractive, abstractive, or a combination of both. In this study, we focus on abstractive summarization, which can produce more human-like summaries. For the study, we created a Turkish news summarization benchmark dataset by crawling various news agency web portals for the news title, short news, news content, and keywords of articles from the last 5 years. The dataset is publicly available to researchers. The deep learning network was trained on the news headlines and short news contents from the prepared dataset, and the network was then expected to generate the news headline as the summary of the short news. To evaluate performance, Rouge-1, Rouge-2, and Rouge-L were compared using precision, sensitivity (recall), and F1 scores. Performance values are presented for each sentence as well as averaged over 50 randomly selected sentences. The F1 values are 0.4317, 0.2194, and 0.4334 for Rouge-1, Rouge-2, and Rouge-L, respectively. The results show that the approach is promising for Turkish text summarization and that the prepared dataset will add value to the literature.
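To make the evaluation measures in the abstract concrete, the following is a minimal Python sketch of how Rouge-1, Rouge-2, and Rouge-L precision, recall, and F1 can be computed for a single candidate/reference pair. It is an illustration only and not the authors' evaluation code: the function names (`ngrams`, `prf`, `rouge_n`, `lcs_length`, `rouge_l`), the lowercasing, the whitespace tokenization, and the balanced F1 are all assumptions, and established ROUGE toolkits may differ in tokenization, stemming, and F-measure weighting.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def prf(overlap, cand_total, ref_total):
    """Turn an overlap count into (precision, recall, balanced F1)."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def rouge_n(candidate, reference, n):
    """ROUGE-N from clipped n-gram overlap (assumed whitespace tokenization)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    return prf(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Sentence-level ROUGE-L from the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return prf(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical example pair, not taken from the dataset.
    generated = "the cat sat on the mat"
    target = "the cat is on the mat"
    for name, scores in [
        ("Rouge-1", rouge_n(generated, target, 1)),
        ("Rouge-2", rouge_n(generated, target, 2)),
        ("Rouge-L", rouge_l(generated, target)),
    ]:
        print(name, ["%.4f" % s for s in scores])
```

Averaging such per-pair scores over a sample of generated headlines would give corpus-level figures of the kind reported above, although the exact procedure used in the study is not specified here.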
Pages: 10