Annotating an Arabic Learner Corpus for Error

Authors
Abuhakema, Ghazi [1]
Faraj, Reem [1]
Feldman, Anna [1]
Fitzpatrick, Eileen [1]
Affiliation
[1] Montclair State Univ, Montclair, NJ 07043 USA
Source
SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 | 2008
Keywords
DOI
Not available
Chinese Library Classification
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysis (CEA) on the data. We adapted the French Interlanguage Database FRIDA tagset (Granger, 2003a) to the data. We chose FRIDA in order to follow a known standard and to see whether the changes needed to move from a French to an Arabic tagset would give us a measure of the distance between the two languages with respect to learner difficulty. The current collection of texts, which is constantly growing, contains intermediate and advanced-level student writings. We describe the need for such corpora, the learner data we have collected and the tagset we have developed. We also describe the error frequency distribution of both proficiency levels and the ongoing work.
Pages: 1347-1350
Page count: 4