Nucleus Composition in Transition-based Dependency Parsing

Cited by: 1
Authors
Nivre, Joakim [1 ]
Basirat, Ali [2 ]
Dürlich, Luise [1 ]
Moss, Adam [3 ]
Affiliations
[1] Uppsala Univ, RISE Res Inst Sweden, Dept Linguist & Philol, Uppsala, Sweden
[2] Linkoping Univ, Dept Comp & Informat Sci, Linkoping, Sweden
[3] Uppsala Univ, Dept Linguist & Philol, Uppsala, Sweden
Funding
Swedish Research Council;
Keywords
Syntactics;
DOI
10.1162/coli_a_00450
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dependency-based approaches to syntactic analysis assume that syntactic structure can be analyzed in terms of binary asymmetric dependency relations holding between elementary syntactic units. Computational models for dependency parsing almost universally assume that an elementary syntactic unit is a word, while the influential theory of Lucien Tesniere instead posits a more abstract notion of nucleus, which may be realized as one or more words. In this article, we investigate the effect of enriching computational parsing models with a concept of nucleus inspired by Tesniere. We begin by reviewing how the concept of nucleus can be defined in the framework of Universal Dependencies, which has become the de facto standard for training and evaluating supervised dependency parsers, and explaining how composition functions can be used to make neural transition-based dependency parsers aware of the nuclei thus defined. We then perform an extensive experimental study, using data from 20 languages to assess the impact of nucleus composition across languages with different typological characteristics, and utilizing a variety of analytical tools including ablation, linear mixed-effects models, diagnostic classifiers, and dimensionality reduction. The analysis reveals that nucleus composition gives small but consistent improvements in parsing accuracy for most languages, and that the improvement mainly concerns the analysis of main predicates, nominal dependents, clausal dependents, and coordination structures. Significant factors explaining the rate of improvement across languages include entropy in coordination structures and frequency of certain function words, in particular determiners. Analysis using dimensionality reduction and diagnostic classifiers suggests that nucleus composition increases the similarity of vectors representing nuclei of the same syntactic type.
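The abstract mentions that composition functions are used to make a neural transition-based parser aware of nuclei, i.e., to merge the vectors of a content word and its function-word dependents into a single representation. As a rough, hedged illustration only (the article itself specifies the actual architecture), the sketch below shows one common way such a composition could be implemented in PyTorch; the tanh-over-concatenation form, the dimensions, and all names (NucleusComposition, composer, etc.) are illustrative assumptions rather than the authors' code.

```python
import torch
import torch.nn as nn

class NucleusComposition(nn.Module):
    """Merge a head vector and a functional dependent vector into one nucleus vector."""
    def __init__(self, word_dim: int, rel_dim: int):
        super().__init__()
        # Linear map from the concatenation [head; dependent; relation] back to word_dim.
        self.compose = nn.Linear(2 * word_dim + rel_dim, word_dim)

    def forward(self, head: torch.Tensor, dep: torch.Tensor, rel: torch.Tensor) -> torch.Tensor:
        # tanh(W [head; dep; rel] + b): one standard recursive composition form
        # (an assumption here, not necessarily the function used in the article).
        return torch.tanh(self.compose(torch.cat([head, dep, rel], dim=-1)))

# Hypothetical usage: after a transition attaches a function word (det, aux,
# case, cop, ...) to its head, the head's vector on the stack would be
# replaced by the composed nucleus vector.
composer = NucleusComposition(word_dim=100, rel_dim=20)
head_vec = torch.randn(100)   # content word, e.g. "dog"
dep_vec = torch.randn(100)    # function word, e.g. "the"
rel_vec = torch.randn(20)     # embedding of the "det" relation
nucleus_vec = composer(head_vec, dep_vec, rel_vec)  # shape: (100,)
```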
Pages: 849-886
Number of pages: 38
Related Papers
50 in total (items [31]-[40] shown)
  • [31] An accurate transformer-based model for transition-based dependency parsing of free word order languages
    Zuhra, Fatima Tuz
    Saleem, Khalid
    Naz, Surayya
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2024, 36 (06)
  • [32] Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing - A Tale of Two Parsers Revisited
    Kulmizev, Artur
    de Lhoneux, Miryam
    Gontrum, Johannes
    Fano, Elena
    Nivre, Joakim
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 2755 - 2768
  • [33] Inducing and Using Alignments for Transition-based AMR Parsing
    Drozdov, Andrew
    Zhou, Jiawei
    Florian, Radu
    McCallum, Andrew
    Naseem, Tahira
    Kim, Yoon
    Astudillo, Ramon Fernandez
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1086 - 1098
  • [34] Structured Training for Neural Network Transition-Based Parsing
    Weiss, David
    Alberti, Chris
    Collins, Michael
    Petrov, Slav
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1, 2015, : 323 - 333
  • [35] Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning
    Naseem, Tahira
    Shah, Abhishek
    Wan, Hui
    Florian, Radu
    Roukos, Salim
    Ballesteros, Miguel
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 4586 - 4592
  • [36] A Transition-based Approach for AMR Parsing using LSTM Networks
    Cimpian, Silviana
    Lazar, Andreea
    Macicasan, Florin
    Lemnaru, Camelia
    2017 13TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2017, : 103 - 110
  • [37] Inherent Dependency Displacement Bias of Transition-Based Algorithms
    Anderson, Mark
    Gomez-Rodriguez, Carlos
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5147 - 5155
  • [38] Better Transition-Based AMR Parsing with a Refined Search Space
    Guo, Zhijiang
    Lu, Wei
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1712 - 1722
  • [39] Joint POS Tagging and Dependence Parsing With Transition-Based Neural Networks
    Yang, Liner
    Zhang, Meishan
    Liu, Yang
    Sun, Maosong
    Yu, Nan
    Fu, Guohong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (08) : 1352 - 1358
  • [40] Boosting Transition-based AMR Parsing with Refined Actions and Auxiliary Analyzers
    Wang, Chuan
    Xue, Nianwen
    Pradhan, Sameer
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 857 - 862