Annotating an Arabic Learner Corpus for Error

Authors
Abuhakema, Ghazi [1]
Faraj, Reem [1]
Feldman, Anna [1]
Fitzpatrick, Eileen [1]
Affiliation
[1] Montclair State Univ, Montclair, NJ 07043 USA
Source
SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 | 2008
Keywords
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysis (CEA) on the data. We adapted the French Interlanguage Database FRIDA tagset (Granger, 2003a) to the data. We chose FRIDA in order to follow a known standard and to see whether the changes needed to move from a French to an Arabic tagset would give us a measure of the distance between the two languages with respect to learner difficulty. The current collection of texts, which is constantly growing, contains intermediate and advanced-level student writings. We describe the need for such corpora, the learner data we have collected and the tagset we have developed. We also describe the error frequency distribution of both proficiency levels and the ongoing work.
Pages: 1347-1350
Number of pages: 4
Related papers
50 in total
  • [21] Automatic Error Detection concerning the Definite and Indefinite Conjugation in the Hun Learner Corpus
    Vincze, Veronika
    Zsibrita, Janos
    Durst, Peter
    Szabo, Martina Katalin
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3958 - 3962
  • [22] THE HOWS AND WHYS OF CODING CATEGORIES IN A LEARNER CORPUS (OR "HOW AND WHY AN ERROR-TAGGED LEARNER CORPUS IS NOT IPSO FACTO ONE BIG COMPARATIVE FALLACY")
    Tenfjord, Kari
    Hagen, Jon Erik
    Johansen, Hilde
    RIVISTA DI PSICOLINGUISTICA APPLICATA-JOURNAL OF APPLIED PSYCHOLINGUISTICS, 2006, 6 (03): : 93 - 108
  • [23] Annotation of a Learner Corpus toward Development of an Error-cause Presenting Technique
    Kotani, Katsunori
    Yoshimi, Takehiko
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES ENHANCING EDUCATION (ICAT2E 2017), 2017, 68 : 78 - 81
  • [24] How useful are corpus tools for error correction? Insights from learner data
    Dolgova, Natalia
    Mueller, Charles
    JOURNAL OF ENGLISH FOR ACADEMIC PURPOSES, 2019, 39 : 97 - 108
  • [25] The IFCASL corpus as a phonetic learner corpus
    Trouvain, Jurgen
    ZEITSCHRIFT FUR GERMANISTISCHE LINGUISTIK, 2022, 50 (01): : 82 - 103
  • [26] Error-Tagging of CroLTeC (Electronic Learner Corpus of Croatian as a Foreign Language)
    Preradovic, Nives Mikelic
    RASPRAVE, 2020, 46 (02): : 899 - 920
  • [27] FrSemCor: Annotating a French corpus with supersenses
    Barque, L.
    Haas, P.
    Huyghe, R.
    Tribout, D.
    Candito, M.
    Crabbe, B.
    Segonne, V
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5904 - 5910
  • [28] Annotating Arguments in a Corpus of Opinion Articles
    Rocha, Gil
    Trigo, Luis
    Cardoso, Henrique Lopes
    Sousa-Silva, Rui
    Carvalho, Paula
    Martins, Bruno
    Won, Miguel
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1890 - 1899
  • [29] Annotating Arguments in a Parliamentary Corpus: An Experience
    Koit, Mare
    PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KEOD), VOL 2, 2020, : 213 - 218
  • [30] Building a learner corpus
    Hana, Jirka
    Rosen, Alexandr
    Stindlova, Barbora
    Jaeger, Petr
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 3228 - 3232