A Pilot Arabic CCGbank

Authors
Boxwell, Stephen A. [1]
Brew, Chris [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
DOI
Not available
Chinese Library Classification number
H [Language and Linguistics];
Discipline classification code
05;
Abstract
We describe a process for converting the Penn Arabic Treebank into the CCG formalism. Previous efforts have yielded CCGbanks in English, German, and Turkish, thus opening these languages to the sophisticated computational tools developed for CCG and enabling further cross-linguistic development. Conversion from a context-free grammar treebank to a CCGbank is a four-stage process: head finding, argument classification, binarization, and category conversion. In the process of implementing a basic CCGbank conversion algorithm, we reveal properties of Arabic grammar that interfere with conversion, such as subject topicalization, genitive constructions, relative clauses, and optional pronominal subjects. All of these problematic phenomena can be resolved in a variety of ways; we discuss the advantages and disadvantages of each in their respective sections. We describe our categorial analysis of each of these Arabic grammatical phenomena in depth, and give technical details on their integration into the conversion algorithm.
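The four stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `HEAD_CHILD` and `COMPLEMENTS` tables, and the toy verb-initial tree (English glosses standing in for Arabic words) are all assumptions made for illustration, and none of it is taken from the Penn Arabic Treebank or from the paper's rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None
    role: str = "head"      # "head", "complement", or "adjunct"
    category: str = ""      # CCG category, filled in by stage 4


# Hypothetical rule tables for the toy example only.
HEAD_CHILD = {"S": "V", "NP": "N"}   # parent label -> label of its head child
COMPLEMENTS = {"NP"}                 # non-head labels treated as arguments


def mark_roles(node: Node) -> None:
    """Stages 1-2: find the head child, classify sisters as complement/adjunct."""
    if not node.children:
        return
    wanted = HEAD_CHILD.get(node.label)
    head = next((c for c in node.children if c.label == wanted), node.children[0])
    for child in node.children:
        if child is head:
            child.role = "head"
        elif child.label in COMPLEMENTS:
            child.role = "complement"
        else:
            child.role = "adjunct"
        mark_roles(child)


def binarize(node: Node) -> None:
    """Stage 3: fold sisters onto the head so every node has at most two children."""
    for child in node.children:
        binarize(child)
    if len(node.children) <= 2:
        return
    idx = next(i for i, c in enumerate(node.children) if c.role == "head")
    left, right = node.children[:idx], node.children[idx + 1:]
    current = node.children[idx]
    # Right sisters attach first (nearest first), then left sisters (nearest first).
    sisters = [(s, "R") for s in right] + [(s, "L") for s in reversed(left)]
    for sister, side in sisters[:-1]:
        pair = [current, sister] if side == "R" else [sister, current]
        current = Node(node.label, pair, role="head")
    last, side = sisters[-1]
    node.children = [current, last] if side == "R" else [last, current]


def wrap(cat: str) -> str:
    """Parenthesize complex categories before embedding them in a larger one."""
    return f"({cat})" if "/" in cat or "\\" in cat else cat


def assign(node: Node, cat: str) -> None:
    """Stage 4: push categories top-down through the binarized tree."""
    node.category = cat
    kids = node.children
    if len(kids) == 1:
        assign(kids[0], cat)
    elif len(kids) == 2:
        head_i = 0 if kids[0].role == "head" else 1
        head, sister = kids[head_i], kids[1 - head_i]
        if sister.role == "complement":
            # The head seeks its argument: forward slash if the argument is to the right.
            slash = "/" if head_i == 0 else "\\"
            assign(sister, sister.label)
            assign(head, f"{wrap(cat)}{slash}{sister.label}")
        else:
            # An adjunct maps X to X; the slash points toward the head.
            slash = "\\" if head_i == 0 else "/"
            assign(sister, f"{wrap(cat)}{slash}{wrap(cat)}")
            assign(head, cat)


def lexicon(node: Node) -> List[Tuple[str, str]]:
    """Collect (word, category) pairs from the leaves, left to right."""
    if node.word is not None:
        return [(node.word, node.category)]
    return [pair for child in node.children for pair in lexicon(child)]


# Toy verb-subject-object clause; the glosses are placeholders, not treebank data.
tree = Node("S", [
    Node("V", word="wrote"),
    Node("NP", [Node("N", word="the-boy")]),
    Node("NP", [Node("N", word="a-letter")]),
])
mark_roles(tree)
binarize(tree)
assign(tree, "S")
print(lexicon(tree))
```

In this toy run the verb-initial clause yields a verb category that looks rightward for both of its arguments. The phenomena listed in the abstract (subject topicalization, genitive constructions, relative clauses, optional pronominal subjects) are exactly the cases a simple scheme like this does not handle.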
Pages: 1881-1888
Page count: 8