Unsupervised Statistical Text Simplification

Cited by: 8
Authors
Qiang, Jipeng [1]
Wu, Xindong [2, 3]
Affiliations
[1] Yangzhou Univ, Dept Comp Sci, Yangzhou 225127, Jiangsu, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 10084, Anhui, Peoples R China
[3] Mininglamp Acad Sci, Mininglamp, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Encyclopedias; Electronic publishing; Internet; Benchmark testing; Standards; Mathematical model; Text simplification; Machine translation; Unsupervised
DOI
10.1109/TKDE.2019.2947679
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most recent approaches to Text Simplification (TS) draw on insights from machine translation to learn simplification rewrites from monolingual parallel corpora of complex and simple sentences, so their effectiveness depends heavily on large amounts of parallel sentences. However, a serious problem has haunted TS for decades: parallel TS corpora are scarce or ill-suited to the learning task. In this paper, we focus on the especially useful and challenging problem of unsupervised TS, which uses not a single parallel sentence. To the best of our knowledge, we present the first unsupervised text simplification system built on phrase-based machine translation, which relies on a careful initialization of the phrase tables and language models. On the widely used WikiLarge and WikiSmall benchmarks, our system obtains 39.08 and 25.12 SARI points, respectively, even outperforming some supervised baselines.
Pages: 1802-1806
Page count: 5