Construction of an English Dependency Corpus incorporating Compound Function Words

Authors
Kato, Akihiko [1]
Shindo, Hiroyuki [1]
Matsumoto, Yuji [1]
Affiliation
[1] Nara Inst Sci Technol, 8916-5 Takayama, Nara 6300192, Japan
Keywords
Multiword Expressions; Dependency Parsing
DOI
Not available
Chinese Library Classification number
H [Language, Writing]
Discipline classification code
05
Abstract
Recognizing multiword expressions (MWEs) in a sentence is important for linguistic analyses such as syntactic and semantic parsing, because combining an MWE into a single token is known to improve accuracy on various NLP tasks, such as dependency parsing and constituency parsing. However, MWEs are not annotated in the Penn Treebank. A direct way to convert word-based dependencies into MWE-aware dependencies is to combine the nodes of an MWE into a single node. This method, however, often leads to the following problem: a node derived from an MWE can have multiple heads, and the whole dependency structure including the MWE may be cyclic. Therefore, we converted the phrase structure to a dependency structure after establishing each MWE as a single subtree. This approach avoids multiple heads and cycles. In this way, we constructed an English dependency corpus that takes into account compound function words, a type of MWE that serves as a functional expression. In addition, we report experimental results of dependency parsing using the constructed corpus.
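The multiple-head and cycle problem described in the abstract can be illustrated with a small sketch. It is not the authors' conversion procedure: the functions `merge_mwe`, `has_multiple_heads`, and `has_cycle`, the head-map encoding (token index to head index, with 0 as an artificial root), and the toy trees are all assumptions made for illustration. The sketch only shows what can go wrong when the nodes of an MWE are collapsed directly in a word-based dependency tree.

```python
from collections import defaultdict


def merge_mwe(heads, mwe):
    """Naively collapse the tokens in `mwe` into a single node.

    heads: dict mapping token index -> head index (0 is an artificial root).
    mwe:   set of token indices that form one MWE (hypothetical input).
    Returns a dict mapping each node of the merged graph -> set of its heads.
    """
    rep = min(mwe)  # the merged node reuses the smallest index in the span

    def node(i):
        return rep if i in mwe else i

    merged = defaultdict(set)
    for dep, head in heads.items():
        d, h = node(dep), node(head)
        if d != h:  # arcs internal to the MWE disappear after merging
            merged[d].add(h)
    return dict(merged)


def has_multiple_heads(merged):
    """True if some node ends up with more than one head."""
    return any(len(hs) > 1 for hs in merged.values())


def has_cycle(merged):
    """True if following head links from some node leads back to it."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts as WHITE

    def visit(n):
        color[n] = GREY
        for h in merged.get(n, ()):
            if color[h] == GREY:
                return True
            if color[h] == WHITE and visit(h):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(merged))


# Toy trees over four tokens; tokens 2 and 3 are treated as one MWE.
mwe = {2, 3}
tree_ok = {1: 0, 2: 3, 3: 1, 4: 3}        # MWE tokens already form a subtree
tree_two_heads = {1: 0, 2: 1, 3: 4, 4: 1}  # MWE tokens attach to different heads
tree_cyclic = {1: 0, 2: 4, 3: 1, 4: 3}     # token 4 sits between the MWE tokens
```

In the first toy tree the MWE tokens already form a connected subtree, so merging is harmless. In the other two, the merged node inherits heads from both of its tokens, and in the last one token 4 both governs and depends on the merged node, which yields a cycle. This is the situation that establishing the MWE as a single subtree before conversion is meant to avoid.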
Pages: 1667-1671
Page count: 5