Splitting Complex Sentences for Natural Language Processing Applications: Building a Simplified Spanish Corpus

Cited by: 8
Author
Camacho Collados, Jose [1]
Affiliation
[1] Univ Autonoma Barcelona, Barcelona 08290, Spain
Keywords
text simplification; syntactic simplification; parallel corpus; Spanish; natural language processing
DOI
10.1016/j.sbspro.2013.10.670
Chinese Library Classification number
H [Language and Writing]
Subject classification code
05
Abstract
This paper presents a new Spanish parallel corpus of original and syntactically simplified texts. The simplification consists mainly of opportunistically splitting a complex original sentence into several simple ones. The parallel corpus is envisioned as a first step toward an automatic syntactic simplification system to be used as a preprocessing tool for other Natural Language Processing tasks such as Text Summarization, Information Extraction, Parsing, or Machine Translation. Human annotators evaluated the corpus for grammaticality and preservation of meaning. The results suggest that the meaning of simplified and original sentences is almost identical. (C) 2013 The Authors. Published by Elsevier Ltd.
Pages: 464-472
Number of pages: 9