A computational grammar for a fragment of Nheengatu

Cited by: 1
Authors
de Alencar, Leonel Figueiredo [1]
Affiliations
[1] Univ Fed Ceara UFC, Fortaleza, Ceara, Brazil
Keywords
Amazonian Lingua Franca; Modern Tupi; qualifying predication; possessive construction; machine translation; computational linguistics; natural language processing;
DOI
10.17851/2237-2083.29.3.1717-1777
Chinese Library Classification (CLC) number
H [Language, Writing];
Discipline classification code
05;
Abstract
The availability of resources for computational processing is one of the survival factors of a language. The goal of this work was to implement a fragment of Nheengatu in the Grammatical Framework formalism, specially designed for the development of multilingual applications. Once more widely spoken than Portuguese in the Amazon region, Nheengatu is threatened with extinction, although it still has an estimated number of 14,000 speakers. The fragment is restricted to sentences that express contingent and non-contingent states, but includes structurally complex grammatical phenomena typical of the Tupi-Guarani family, which strongly contrast with the equivalent constructions in Portuguese and English. It constitutes one of the modules of GrammYEP, a multilingual computational grammar comprising equivalent English and Portuguese modules. The starting point of the implementation was the non-formalized grammatical descriptions of Navarro (2011) and Cruz (2011). The formalization revealed gaps and inconsistencies in these approaches, which were partly remedied through a reanalysis of the data. GrammYEP achieved quite satisfactory results in the translation from and to Nheengatu. It translated into Portuguese and English all examples from a test set with 142 Nheengatu sentences. Conversely, 98.18% and 84.11% of the corresponding Portuguese and English test sets were rendered into Nheengatu. On the other hand, it parsed only two examples from a negative test set with 171 ungrammatical constructions in Nheengatu. This evaluation resulted in a treebank with 243 Nheengatu sentences, paired with the equivalent sentences in Portuguese and English.
Pages: 1717-1777
Number of pages: 61