An Automated Term Definition Extraction System Using the Web Corpus in the Chinese Language

Cited by: 0
Authors
Leu, Fang-Yie [1]
Ko, Chih-Chieh [1]
Affiliations
[1] Tunghai Univ, Dept Comp Sci, Taichung 407, Taiwan
Funding
Russian Foundation for Basic Research;
Keywords
definitions; web corpus; information extraction; Chinese language; text mining;
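DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract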
This paper proposes a system, named DefExplorer, which analyzes the type of a given Chinese term, extracts term definitions from the Web, and selects answers from noisy Web pages. DefExplorer filters out invalid data with a semantic approach. Two types of candidate sets, common and domain-specific, are employed to cluster similar candidates into groups. Different approaches are also deployed to evaluate each candidate's importance, which is the key factor in selecting the best answers from the retrieved candidates. Experimental results show that DefExplorer can effectively extract term definitions from the Web, especially definitions of out-of-vocabulary terms.
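The abstract does not specify how candidates are clustered or how importance is measured, so the following is only a minimal sketch of the general idea (group similar candidate definitions, then pick an answer from the most supported group). Everything in it is an assumption made for illustration: the function names, the character-bigram Jaccard similarity, the 0.5 threshold, and the use of group size as a stand-in for importance are not taken from the paper.

```python
from typing import List, Set


def char_bigrams(text: str) -> Set[str]:
    """Return the set of character bigrams of a string.

    Character bigrams avoid the need for Chinese word segmentation
    (an illustrative choice, not the paper's method).
    """
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + 2]) for i in range(len(chars) - 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_candidates(candidates: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass clustering.

    Each candidate joins the first group whose first member (the seed)
    is at least `threshold` similar; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for cand in candidates:
        grams = char_bigrams(cand)
        for group in groups:
            if jaccard(grams, char_bigrams(group[0])) >= threshold:
                group.append(cand)
                break
        else:
            groups.append([cand])
    return groups


def select_best(candidates: List[str], threshold: float = 0.5) -> str:
    """Return the seed of the largest group.

    Group size is a hypothetical proxy for candidate importance.
    """
    if not candidates:
        raise ValueError("no candidates to select from")
    groups = cluster_candidates(candidates, threshold)
    return max(groups, key=len)[0]
```

Calling `select_best` on a list of candidate definition strings returns the first-seen member of the largest group of near-duplicates; a real system would replace both the similarity function and the scoring with the semantic measures the paper describes.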
Pages: 505 - 525
Number of pages: 21
Related papers
50 in total
  • [31] Using Information Uncertainty for Keywords Extraction in Language System
    Yan, Rong
    Pan, Yunfei
    2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018), 2018, : 607 - 611
  • [32] Development of a Biotechnologically-Enhanced Interactive Learning System for College-Level Chinese Language Using Biological Corpus Technology
    Xie, Xiaoyan
    Journal of Commercial Biotechnology, 2024, 29 (02) : 248 - 259
  • [33] Using Semi-automated Term Extraction for IsiNdebele Health Terminology
    Malele, Nomsebenzi
    Bosch, Sonja
    LEXIKOS, 2024, 34 : 269 - 287
  • [34] Research on Automatic Chinese Multi-word Term Extraction Based on Integration of Web Information and Term Component
    Kang, Wei
    Sui, Zhifang
    Liu, Yao
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 3, 2009, : 267 - +
  • [35] Automated knowledge extraction from polymer literature using natural language processing
    Shetty, Pranav
    Ramprasad, Rampi
    ISCIENCE, 2021, 24 (01)
  • [36] A Chinese language expert system using Bayesian learning
    Wu, YY
    Zhang, JJ
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XIV, PROCEEDINGS: COMPUTER AND INFORMATION SYSTEMS, TECHNOLOGIES AND APPLICATIONS, 2004, : 90 - 95
  • [37] Application system requirements definition using the unified modeling language (UML)
    Jackson, RB
    ASSOCIATION FOR INFORMATION SYSTEMS PROCEEDINGS OF THE AMERICAS CONFERENCE ON INFORMATION SYSTEMS, 1998, : 689 - 691
  • [38] Automated Text Extraction from Images using OCR System
    Kaundilya, Chandni
    Chawla, Diksha
    Chopra, Yatin
    PROCEEDINGS OF THE 2019 6TH INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2019, : 145 - 150
  • [39] An argument-based decision support system for assessing natural language usage on the basis of the Web Corpus
    Ivan Chesnevar, Carlos
    Sabate-Carrove, Mariona
    Maguitman, Ana Gabriela
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2006, 21 (11) : 1151 - 1180
  • [40] Web Page Information Extraction System by Using Deep Learning
    Pakyurek, Muhammet
    Sezgin, Mehmet Selman
    Kulac, Selman
    2019 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2019, : 361 - 365