An Automated Term Definition Extraction System Using the Web Corpus in the Chinese Language

Cited by: 0
Authors
Leu, Fang-Yie [1]
Ko, Chih-Chieh [1]
Affiliations
[1] Tunghai Univ, Dept Comp Sci, Taichung 407, Taiwan
Funding
Russian Foundation for Basic Research;
Keywords
definitions; web corpus; information extraction; Chinese language; text mining;
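DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code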
0812 ;
Abstract
This paper proposes a system, named DefExplorer, which analyzes the type of a given Chinese term, extracts term definitions from the Web, and selects answers from noisy Web pages. DefExplorer filters out invalid data with a semantic approach. Two types of candidate sets, common and domain-specific, are employed to cluster similar candidates into groups. Different approaches are also deployed to evaluate each candidate's importance, which is the key factor in selecting the best answers from the retrieved candidates. Experimental results show that DefExplorer can effectively extract term definitions from the Web, especially definitions of out-of-vocabulary terms.
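The abstract does not give the clustering or scoring details, so the following is only a minimal sketch of the general idea of grouping similar candidate definitions and choosing an answer from the most important group. Everything in it is an assumption made for illustration and not DefExplorer's method: the character-bigram Jaccard similarity, the 0.5 threshold, the use of group size as a stand-in for importance, and the hypothetical function names `char_bigrams`, `jaccard`, `cluster_candidates`, and `best_definition`.

```python
from typing import List, Set


def char_bigrams(text: str) -> Set[str]:
    """Character bigrams of the text with whitespace removed.

    Assumption: bigrams are used because Chinese has no word spaces.
    """
    cleaned = "".join(text.split())
    if len(cleaned) < 2:
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + 2] for i in range(len(cleaned) - 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_candidates(candidates: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass grouping.

    Each candidate joins the first group whose first member is similar
    enough; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for cand in candidates:
        grams = char_bigrams(cand)
        for group in groups:
            if jaccard(grams, char_bigrams(group[0])) >= threshold:
                group.append(cand)
                break
        else:
            groups.append([cand])
    return groups


def best_definition(candidates: List[str], threshold: float = 0.5) -> str:
    """Return the first member of the largest group.

    Assumption: group size stands in for candidate importance.
    """
    if not candidates:
        raise ValueError("no candidates")
    groups = cluster_candidates(candidates, threshold)
    return max(groups, key=len)[0]


if __name__ == "__main__":
    # Hypothetical candidates for one term, including a noisy page fragment.
    sample = ["部落格是一種網路日誌", "部落格是一種網路日誌。", "點擊這裡登入會員"]
    print(best_definition(sample))
```

In this sketch the two near-duplicate definitions fall into one group and the unrelated page fragment into another, so the larger group supplies the answer. A real system would likely replace group size with a richer importance score.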
Pages: 505 - 525
Page count: 21
Related papers
50 in total
  • [1] An automated term definition extraction using the web corpus in chinese language
    Leu, Fang-Yie
    Ko, Chih-Chieh
    PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07), 2007, : 435 - +
  • [2] Towards automated phenotype definition extraction using large language models
    Ramya Tekumalla
    Juan M. Banda
    Genomics & Informatics, 22 (1)
  • [3] Chinese Historical Term Translation Pairs Extraction Using Modern Chinese as a Pivot Language
    Wu, Xiaoting
    Zhao, Hanyu
    Jing, Lei
    Che, Chao
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 358 - 367
  • [4] Web text corpus extraction system for linguistic tasks
    Cadavid Rengifo, Hector Fabio
    Gomez Perdomo, Jonatan
    INGENIERIA E INVESTIGACION, 2009, 29 (03): : 54 - 60
  • [5] A novel feature selection framework in Chinese term definition extraction
    Pan, Xu
    Gu, Hong-Bin
    Zhao, Zhi-Qing
    Information Technology Journal, 2012, 11 (01) : 148 - 153
  • [6] Web-based Chinese term extraction in the field of study
    Guo, Rui
    Qiu, Jing
    Zhang, Guanghua
    2015 11TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2015, : 133 - 139
  • [7] Specific Web Spider Design for the Extraction of Unknown Chinese Words from BBS Corpus
    Xiong, Hai-ling
    Du, Jing
    2009 SECOND INTERNATIONAL CONFERENCE ON FUTURE INFORMATION TECHNOLOGY AND MANAGEMENT ENGINEERING, FITME 2009, 2009, : 499 - 502
  • [8] An algorithm of Chinese domain term extraction based on language feature
    Fu, Ji-Bin
    Fan, Xiao-Zhong
    Mao, Jin-Tao
    Yu, Zheng-Tao
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2010, 30 (03): : 307 - 310
  • [9] A fully automated object extraction system for the World Wide Web
    Buttler, D
    Liu, L
    Pu, C
    21ST INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, PROCEEDINGS, 2001, : 361 - 370
  • [10] Chinese terminology extraction using bilingual web resources
    Yang, Yuhang
    Lu, Qin
    Ji, Luning
    Zhao, Tiejun
    PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07), 2007, : 347 - +