Unified Training for Cross-Lingual Abstractive Summarization by Aligning Parallel Machine Translation Pairs

Cited by: 0
Authors
Cheng, Shaohuan [1 ]
Chen, Wenyu [1 ]
Tang, Yujia [1 ]
Fu, Mingsheng [1 ]
Qu, Hong [1 ]
Affiliations
[1] University of Electronic Science and Technology of China, School of Computer Science and Engineering, Chengdu 611731, People's Republic of China
Keywords
cross-lingual summarization; multi-task learning; machine translation; low-resource scenario
DOI
10.3390/math12132107
Chinese Library Classification number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Cross-lingual summarization (CLS) is essential for enhancing global communication by facilitating efficient information exchange across different languages. However, owing to the scarcity of CLS data, recent studies have employed multi-task frameworks to combine parallel monolingual summaries. These methods often use independent decoders or models with non-shared parameters because of the mismatch in output languages, which limits the transfer of knowledge between CLS and its parallel data. To address this issue, we propose a unified training method for CLS that combines parallel machine translation (MT) pairs with CLS pairs, jointly training them within a single model. This design ensures consistent input and output languages and promotes knowledge sharing between the two tasks. To further enhance the model's capability to focus on key information, we introduce two additional loss terms to align the hidden representations and probability distributions between the parallel MT and CLS pairs. Experimental results demonstrate that our method outperforms competitive methods in both full-dataset and low-resource scenarios on two benchmark datasets, Zh2EnSum and En2ZhSum.
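The abstract describes a joint objective: task losses for the CLS and MT pairs plus two alignment terms, one on hidden representations and one on output probability distributions. The following is a minimal single-token sketch of how such an objective could be assembled. It is not the authors' implementation. The choice of mean squared error for the hidden-state term, KL divergence for the distribution term, the shared target token, and the weights `alpha` and `beta` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the reference token."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_squared_error(a, b):
    """Mean squared distance between two hidden vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def unified_loss(cls_logits, mt_logits, cls_hidden, mt_hidden,
                 target_index, alpha=1.0, beta=1.0):
    """Hypothetical combined loss for one decoding step.

    Assumption: the CLS pair and its parallel MT pair share the same
    target-language token, so their outputs can be compared directly.
    """
    p_cls = softmax(cls_logits)
    p_mt = softmax(mt_logits)
    # Task losses for the two jointly trained pairs.
    task = cross_entropy(p_cls, target_index) + cross_entropy(p_mt, target_index)
    # Assumed alignment terms: MSE on hidden states, KL on distributions.
    hidden_align = mean_squared_error(cls_hidden, mt_hidden)
    dist_align = kl_divergence(p_mt, p_cls)
    return task + alpha * hidden_align + beta * dist_align
```

In this sketch, both alignment terms vanish when the two pairs produce identical hidden vectors and distributions, so the loss reduces to the sum of the two task losses.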
Pages: 16
Related papers
50 in total
  • [21] Unsupervised multilingual machine translation with pretrained cross-lingual encoders
    Shen, Yingli
    Bao, Wei
    Gao, Ge
    Zhou, Maoke
    Zhao, Xiaobing
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [22] Cross-Lingual Ontology Mapping - An Investigation of the Impact of Machine Translation
    Fu, Bo
    Brennan, Rob
    O'Sullivan, Declan
    SEMANTIC WEB, PROCEEDINGS, 2009, 5926 : 1 - +
  • [23] The negative effect of machine translation on Cross-Lingual Question Answering
    Ferrandez, Sergio
    Ferrandez, Antonio
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2007, 4394 : 494 - +
  • [24] Cross-lingual Supervision Improves Unsupervised Neural Machine Translation
    Wang, Mingxuan
    Bai, Hongxiao
    Zhao, Hai
    Li, Lei
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021, : 89 - 96
  • [25] Cross-Lingual Speech-to-Text Summarization
    Pontes, Elvys Linhares
    Gonzalez-Gallardo, Carlos-Emiliano
    Torres-Moreno, Juan-Manuel
    Huet, Stephane
    MULTIMEDIA AND NETWORK INFORMATION SYSTEMS, 2019, 833 : 385 - 395
  • [26] Towards Unifying Multi-Lingual and Cross-Lingual Summarization
    Wang, Jiaan
    Meng, Fandong
    Zheng, Duo
    Liang, Yunlong
    Li, Zhixu
    Qu, Jianfeng
    Zhou, Jie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15127 - 15143
  • [27] Cross-lingual extreme summarization of scholarly documents
    Takeshita, Sotaro
    Green, Tommaso
    Friedrich, Niklas
    Eckert, Kai
    Ponzetto, Simone Paolo
    INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES, 2024, 25 (02) : 249 - 271
  • [28] A Cross-Lingual Summarization method based on cross-lingual Fact-relationship Graph Generation
    Zhang, Yongbing
    Gao, Shengxiang
    Huang, Yuxin
    Tan, Kaiwen
    Yu, Zhengtao
    PATTERN RECOGNITION, 2024, 146
  • [29] MultiSumm: Towards a Unified Model for Multi-Lingual Abstractive Summarization
    Cao, Yue
    Wan, Xiaojun
    Yao, Jin-ge
    Yu, Dian
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11 - 18
  • [30] Cross-lingual training of summarization systems using annotated corpora in a foreign language
    Litvak, Marina
    Last, Mark
    INFORMATION RETRIEVAL, 2013, 16 : 629 - 656