A computational grammar for a fragment of Nheengatu

Cited by: 1
Author
de Alencar, Leonel Figueiredo [1]
Affiliation
[1] Univ Fed Ceara UFC, Fortaleza, Ceara, Brazil
Keywords
Amazonian Lingua Franca; Modern Tupi; qualifying predication; possessive construction; machine translation; computational linguistics; natural language processing
DOI
10.17851/2237-2083.29.3.1717-1777
Chinese Library Classification number
H [Language, Linguistics]
Discipline classification code
05
Abstract
The availability of resources for computational processing is one of the survival factors of a language. The goal of this work was to implement a fragment of Nheengatu in the Grammatical Framework formalism, specially designed for the development of multilingual applications. Once more widely spoken than Portuguese in the Amazon region, Nheengatu is threatened with extinction, although it still has an estimated number of 14,000 speakers. The fragment is restricted to sentences that express contingent and non-contingent states, but includes structurally complex grammatical phenomena typical of the Tupi-Guarani family, which strongly contrast with the equivalent constructions in Portuguese and English. It constitutes one of the modules of GrammYEP, a multilingual computational grammar comprising equivalent English and Portuguese modules. The starting point of the implementation was the non-formalized grammatical descriptions of Navarro (2011) and Cruz (2011). The formalization revealed gaps and inconsistencies in these approaches, which were partly remedied through a reanalysis of the data. GrammYEP achieved quite satisfactory results in the translation from and to Nheengatu. It translated into Portuguese and English all examples from a test set with 142 Nheengatu sentences. Conversely, 98.18% and 84.11% of the corresponding Portuguese and English test sets were rendered into Nheengatu. On the other hand, it parsed only two examples from a negative test set with 171 ungrammatical constructions in Nheengatu. This evaluation resulted in a treebank with 243 Nheengatu sentences, paired with the equivalent sentences in Portuguese and English.
Pages: 1717-1777
Number of pages: 61