Abstractive Summarization of Broadcast News Stories for Estonian

Cited by: 1
Authors
Harm, Henry [1]
Alumae, Tanel [1]
Affiliation
[1] Tallinn Univ Technol, Inst Software Sci, Tallinn, Estonia
Source
BALTIC JOURNAL OF MODERN COMPUTING | 2022 / Vol. 10 / No. 03
Keywords
Abstractive summarization; low-resource languages; pre-trained models; multilingual models; machine-translation;
DOI
10.22364/bjmc.2022.10.3.23
Chinese Library Classification number
TP31 [Computer software];
Subject classification codes
081202 ; 0835 ;
Abstract
We present an approach for generating abstractive summaries for Estonian spoken news stories in a low-resource setting. Given a recording of a radio news story, the goal is to create a summary that captures the essential information in a short format. The approach consists of two steps: automatically generating the transcript and applying a state-of-the-art text summarization system to generate the result. We evaluated a number of models, with the best-performing model leveraging the large English BART model pre-trained on the CNN/DailyMail dataset and fine-tuned on machine-translated in-domain data, and with the test data translated to English and back. The method achieved a ROUGE-1 score of 17.22, improving on the alternatives and achieving the best result in human evaluation. The applicability of the proposed solution might be limited in languages where machine translation systems are not mature. In such cases multilingual BART should be considered, which achieved a ROUGE-1 score of 17.00 overall and a score of 16.22 without machine-translation-based data augmentation.
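The translate, summarize, translate-back flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`summarize_via_english`, `rouge1_f1`) and the three injected callables are hypothetical placeholders for whatever speech recognizer output, translation system, and English summarizer are actually used, and the ROUGE-1 function is a simplified whitespace-token unigram F1 that may differ from the scorer behind the reported numbers.

```python
from collections import Counter
from typing import Callable


def summarize_via_english(
    transcript_et: str,
    translate_et_en: Callable[[str], str],  # hypothetical Estonian -> English translator
    summarize_en: Callable[[str], str],     # hypothetical English summarizer (e.g. a BART wrapper)
    translate_en_et: Callable[[str], str],  # hypothetical English -> Estonian translator
) -> str:
    """Translate an Estonian transcript to English, summarize it, translate back."""
    english_text = translate_et_en(transcript_et)
    english_summary = summarize_en(english_text)
    return translate_en_et(english_summary)


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Passing the components in as callables keeps the sketch independent of any particular translation or summarization library; the multilingual alternative mentioned in the abstract would correspond to calling a single summarizer on the Estonian transcript directly, without the two translation steps.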
Pages: 511-524
Number of pages: 14