Discourse-Based Sentence Splitting

Authors
Cripwell, Liam [1]
Legrand, Joel [2]
Gardent, Claire [1]
Affiliations
[1] Univ Lorraine, CNRS, LORIA, Nancy, France
[2] Univ Lorraine, CNRS, LORIA, Cent Supelec, Nancy, France
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sentence splitting involves the segmentation of a sentence into two or more shorter sentences. It is a key component of sentence simplification, has been shown to help human comprehension, and is a useful preprocessing step for NLP tasks such as summarisation and relation extraction. While several methods and datasets have been proposed for developing sentence splitting models, little attention has been paid to how sentence splitting interacts with discourse structure. In this work, we focus on cases where the input text contains a discourse connective, which we refer to as discourse-based sentence splitting. We create synthetic and organic datasets for discourse-based splitting and explore different ways of combining these datasets using different model architectures. We show that pipeline models which use discourse structure to mediate sentence splitting outperform end-to-end models in learning the various ways of expressing a discourse relation, but generate text that is less grammatical; that large-scale synthetic data provides a better basis for learning than smaller-scale organic data; and that training on discourse-focused rather than general sentence splitting data provides a better basis for discourse splitting.
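To make the task concrete, the following is a minimal rule-based sketch of splitting a sentence at a discourse connective. It is an illustration only and is not the authors' pipeline or end-to-end models. The mapping `CONNECTIVE_TO_ADVERBIAL` and the function `split_on_connective` are hypothetical names introduced here, and the two connectives covered are an arbitrary choice.

```python
import re
from typing import List

# Hypothetical toy mapping (an assumption, not from the paper):
# intra-sentence connective -> sentence-initial adverbial with a similar relation.
CONNECTIVE_TO_ADVERBIAL = {
    "but": "However",
    "so": "Therefore",
}


def split_on_connective(sentence: str) -> List[str]:
    """Split at the first ', <connective> ' found; return the input unchanged otherwise."""
    for connective, adverbial in CONNECTIVE_TO_ADVERBIAL.items():
        # Require a preceding comma so that word-internal matches are avoided.
        match = re.search(rf",\s+{connective}\s+", sentence)
        if match:
            first = sentence[:match.start()].strip()
            second = sentence[match.end():].strip()
            if not first.endswith("."):
                first += "."
            # Re-express the discourse relation with a sentence-initial adverbial.
            return [first, f"{adverbial}, {second}"]
    return [sentence]


print(split_on_connective("The model is small, but it performs well."))
```

A rule of this kind only covers connectives that appear between two clauses after a comma; the abstract's point is that learned models are needed to capture the many other ways a discourse relation can be expressed.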
Pages: 261-273
Number of pages: 13