An Automated Term Definition Extraction System Using the Web Corpus in the Chinese Language

Authors
Leu, Fang-Yie [1]
Ko, Chih-Chieh [1]
Affiliations
[1] Tunghai Univ, Dept Comp Sci, Taichung 407, Taiwan
Funding
Russian Foundation for Basic Research
Keywords
definitions; web corpus; information extraction; Chinese language; text mining
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a system, named DefExplorer, which analyzes the type of given Chinese terms, extracts term definitions from the Web, and selects answers from noisy Web pages. DefExplorer filters out invalid data with a semantic approach. Two types of candidate sets, common and domain-specific, are employed to cluster similar candidates into groups. Different approaches are also deployed to evaluate each candidate's importance, which is the key factor in selecting the best answers from the retrieved candidates. Experimental results show that DefExplorer can effectively extract term definitions from the Web, especially definitions of out-of-vocabulary terms.
Pages: 505-525
Page count: 21