Research on Automatic Chinese Multi-word Term Extraction Based on Term Component

Cited by: 0
Authors
Kang, Wei [1]
Sui, Zhifang [1]
Affiliations
[1] Peking Univ, Inst Computat Linguistics, Beijing 100871, Peoples R China
Keywords
Chinese terminology; Automatic terminology extraction; Term component; Unithood; Termhood;
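DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract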
This paper presents an automatic Chinese multi-word term extraction method based on unithood and termhood measures. The unithood of a candidate term is measured by the strength of its inner unity and its marginal variety. Term components are taken into account to estimate termhood. Inspired by the economical law of term generation, we propose two measures of how likely a candidate is to be a true term: the first is based on the domain speciality of terms, and the second is based on the similarity between a candidate and a template that contains structured information about terms. Experiments on the IT and medicine domains show that the method is effective and portable across domains.
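The abstract does not give formulas for inner unity or marginal variety. As a rough illustration only, the sketch below assumes two common stand-ins: pointwise mutual information for the inner unity of a two-word candidate, and the smaller of the left and right boundary entropies for its marginal variety. The function names, the `<s>` and `</s>` boundary markers, and the restriction to two-word candidates are hypothetical choices made here, not the authors' method.

```python
import math
from collections import Counter


def boundary_entropy(neighbors):
    """Shannon entropy (in nats) of the tokens adjacent to a candidate."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def unithood_scores(sentences, candidate):
    """Return (inner_unity, marginal_variety) for a two-word candidate.

    Assumed stand-ins, not the paper's formulas:
      inner_unity      = pointwise mutual information of the two words
      marginal_variety = min(left boundary entropy, right boundary entropy)
    `sentences` is a list of already-segmented token lists and
    `candidate` is a (word1, word2) tuple.
    """
    w1, w2 = candidate
    unigrams, bigrams = Counter(), Counter()
    left, right = [], []
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
        for i in range(len(tokens) - 1):
            if (tokens[i], tokens[i + 1]) == candidate:
                # Record the neighbouring token, or a boundary marker.
                left.append(tokens[i - 1] if i > 0 else "<s>")
                right.append(tokens[i + 2] if i + 2 < len(tokens) else "</s>")
    if bigrams[candidate] == 0:
        # Candidate never occurs: no evidence of unity or variety.
        return float("-inf"), 0.0
    p_pair = bigrams[candidate] / sum(bigrams.values())
    p1 = unigrams[w1] / sum(unigrams.values())
    p2 = unigrams[w2] / sum(unigrams.values())
    pmi = math.log(p_pair / (p1 * p2))
    return pmi, min(boundary_entropy(left), boundary_entropy(right))
```

Under these assumptions, a candidate whose parts co-occur more often than chance and whose neighbours vary widely scores high on both values. Termhood, which the abstract ties to term components and templates, is not covered by this sketch.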
Pages: 57-67
Page count: 11