Data-driven dependency parsing of Vedic Sanskrit

Cited by: 1
Authors
Hellwig, Oliver [1 ,2 ]
Nehrdich, Sebastian [1 ,3 ]
Sellmer, Sven [1 ,4 ]
Affiliations
[1] Heinrich Heine Univ Dusseldorf, Inst Linguist, Dusseldorf, Germany
[2] Univ Zurich, Dept Comparat Language Sci, Zurich, Switzerland
[3] Univ Hamburg, Khyentse Ctr Tibetan Buddhist Textual Scholarship, Hamburg, Germany
[4] Adam Mickiewicz Univ, Inst Oriental Studies, Poznan, Poland
Keywords
Vedic Sanskrit; Dependency parsing; Low-resource languages; Contextual embeddings; GREEK;
DOI
10.1007/s10579-023-09636-5
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper describes the first data-driven parser for Vedic Sanskrit, an ancient Indo-Aryan language in which a corpus of important religious and philosophical texts has been composed. We report and critically discuss experiments with the input feature representations, paying special attention to the performance of contextualized word embeddings and to the influence of morpho-syntactic representations on the parsing quality. In addition, we provide an in-depth discussion of the parsing errors that covers structural traits of the predicted trees as well as linguistic and extra-textual influence factors. In its optimal configuration, the proposed model achieves 87.61 unlabeled and 81.84 labeled attachment score on a held-out set of test sentences, demonstrating good performance for an under-resourced language.
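The two figures quoted in the abstract are standard dependency-parsing metrics. Unlabeled attachment score (UAS) is the percentage of tokens whose predicted head is correct. Labeled attachment score (LAS) additionally requires the dependency label to be correct. The sketch below shows how they are commonly computed. It is not the authors' evaluation code: the function name `attachment_scores` and the toy sentence are assumptions made for illustration, and the paper's handling of details such as punctuation is not stated in this record.

```python
from typing import List, Tuple

# A token is represented as (head_index, dependency_label).
# Hypothetical representation chosen for this sketch only.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    # UAS counts tokens whose predicted head matches the gold head.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens whose head AND label both match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * uas_hits / n, 100.0 * las_hits / n


# Invented toy data: four tokens, one wrong label, one wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl"), (2, "amod")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

Since a correct label only counts when the head is also correct, LAS can never exceed UAS, which is consistent with the reported 87.61 and 81.84.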
Pages: 1173-1206
Number of pages: 34
Related papers
50 in total
  • [1] Data-driven dependency parsing of Vedic Sanskrit
    Oliver Hellwig
    Sebastian Nehrdich
    Sven Sellmer
    Language Resources and Evaluation, 2023, 57 : 1173 - 1206
  • [2] Graph Transformations in Data-Driven Dependency Parsing
    Nilsson, Jens
    Nivre, Joakim
    Hall, Johan
    COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE, 2006, : 257 - 264
  • [3] Data-driven deep-syntactic dependency parsing
    Ballesteros, Miguel
    Bohnet, Bernd
    Mille, Simon
    Wanner, Leo
    NATURAL LANGUAGE ENGINEERING, 2016, 22 (06) : 939 - 974
  • [4] Three Syntactic Formalisms for Data-Driven Dependency Parsing of Croatian
    Agic, Zeljko
    Merkler, Danijela
    TEXT, SPEECH, AND DIALOGUE, TSD 2013, 2013, 8082 : 560 - 567
  • [5] MaltParser: A language-independent system for data-driven dependency parsing
    Nivre, Joakim
    Hall, Johan
    Nilsson, Jens
    Chanev, Atanas
    Eryigit, Gülşen
    Kübler, Sandra
    Marinov, Svetoslav
    Marsi, Erwin
    Natural Language Engineering, 2007, 13 (02) : 95 - 135
  • [6] Cross-framework parser stacking for data-driven dependency parsing
    Ovrelid, Lilja
    Kuhn, Jonas
    Spreyer, Kathrin
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2009, 50 (03) : 109 - 138
  • [7] The incremental use of morphological information and lexicalization in data-driven dependency parsing
    Eryigit, Gulsen
    Nivre, Joakim
    Oflazer, Kemal
    COMPUTER PROCESSING OF ORIENTAL LANGUAGES, PROCEEDINGS: BEYOND THE ORIENT: THE RESEARCH CHALLENGES AHEAD, 2006, 4285 : 498 - +
  • [8] Comparing rule-based and data-driven dependency parsing of learner language
    Krivanek, Julia
    Meurers, Detmar
    1600, IOS Press BV (258) : 207 - 225
  • [9] Design of Chinese HPSG framework for data-driven parsing
    Wang, Xiangli
    Iwasawa, Shunya
    Miyao, Yusuke
    Matsuzaki, Takuya
    Yu, Kun
    Tsujii, Jun'ichi
    PACLIC 23 - Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, 2009, 2 : 835 - 842
  • [10] The Treebank of Vedic Sanskrit
    Hellwig, Oliver
    Scarlata, Salvatore
    Ackermann, Elia
    Widmer, Paul
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5137 - 5146