Splitting Complex Sentences for Natural Language Processing Applications: Building a Simplified Spanish Corpus

Cited by: 8
Author
Camacho Collados, Jose [1]
Affiliation
[1] Univ Autonoma Barcelona, Barcelona 08290, Spain
Keywords
text simplification; syntactic simplification; parallel corpus; Spanish; natural language processing
DOI
10.1016/j.sbspro.2013.10.670
Chinese Library Classification (CLC) number
H [Language, Linguistics]
Discipline classification code
05
Abstract
This paper presents a new Spanish parallel corpus of original and syntactically simplified texts. The simplification carried out basically consists of opportunistically splitting a complex original sentence into several simple ones. This parallel corpus is envisioned as a first step in order to create an automatic syntactic simplification system to be used as a preprocessing tool for other Natural Language Processing tasks such as Text Summarization, Information Extraction, parsing or Machine Translation. The corpus has been evaluated by human annotators regarding its grammaticality and preservation of meaning. The results suggest that the meaning of simplified and original sentences is almost identical. © 2013 The Authors. Published by Elsevier Ltd.
Pages: 464-472
Number of pages: 9