Information Extraction from Thai Text with Unknown Phrase Boundaries

Cited by: 0
Authors
Intarapaiboon, Peerasak [1]
Nantajeewarawat, Ekawit [1]
Theeramunkong, Thanaruk [1]
Affiliation
[1] Thammasat Univ, Sch Informat Comp & Commun Technol, Sirindhorn Int Inst Technol, Bangkok, Thailand
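Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS | 2009 / Vol. 5476
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract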
Using sliding-window rule application and extraction filtering techniques, we propose a framework for extracting semantic frames from Thai textual phrases with unknown boundaries based on patterns of triggering terms. A supervised rule learning algorithm is used for constructing multi-slot extraction rules from hand-tagged training phrases. A filtering module is introduced for predicting rule application across phrase boundaries based on instantiation features of rule-internal wildcards. The framework is applied to text documents in three domains with different target-phrase densities and average lengths. The experimental results show that the filtering module improves precision while preserving high recall, yielding extraction performance comparable to frame extraction with manually identified phrase boundaries.
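The idea of sliding-window rule application followed by filtering can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rule representation (an ordered list of triggering terms), the function names `match_in_window`, `slide`, and `keep`, the English placeholder tokens, and the fixed gap-length threshold are all assumptions made for illustration. In particular, the abstract describes a filter based on instantiation features of rule-internal wildcards without giving its form, so a simple length cutoff stands in for it here.

```python
from typing import List, Optional, Tuple

Gaps = List[List[str]]


def match_in_window(rule: List[str], window: List[str]) -> Optional[Gaps]:
    """Match the rule's triggers left to right inside one window.

    The first trigger must be the first token of the window. The tokens
    between consecutive triggers are the wildcard instantiations (gaps).
    Returns the gaps, or None if the rule does not match.
    """
    if not window or window[0] != rule[0]:
        return None
    gaps: Gaps = []
    pos = 1
    for trigger in rule[1:]:
        try:
            nxt = window.index(trigger, pos)
        except ValueError:
            return None
        gaps.append(window[pos:nxt])
        pos = nxt + 1
    return gaps


def slide(rule: List[str], tokens: List[str], width: int) -> List[Tuple[int, Gaps]]:
    """Apply the rule at every token offset, since phrase boundaries are unknown."""
    matches = []
    for start in range(len(tokens)):
        gaps = match_in_window(rule, tokens[start:start + width])
        if gaps is not None:
            matches.append((start, gaps))
    return matches


def keep(gaps: Gaps, max_gap: int) -> bool:
    """Stand-in filter (assumption): reject matches whose wildcards span
    too many tokens, as these are likely to cross a phrase boundary."""
    return all(len(gap) <= max_gap for gap in gaps)


# Hypothetical rule and token stream (placeholders, not Thai data).
rule = ["symptom", "in"]
tokens = ["symptom", "fever", "in", "kids", ".",
          "symptom", "cough", ".", "x", "y", "in", "town"]

matches = slide(rule, tokens, width=8)
filtered = [m for m in matches if keep(m[1], max_gap=2)]
```

In this toy run the rule matches at offsets 0 and 5, but the second match only succeeds because its wildcard absorbs tokens from a following phrase, so the stand-in filter discards it.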
Pages: 525-532
Page count: 8
Related papers
50 in total
  • [21] Phrase based feature extraction for musical information retrieval
    Yanase, Takashi
    Takasu, Atsuhiro
    Adachi, Jun
    IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings, 1999, : 396 - 399
  • [22] The contribution of mutual information in the intonational phrase prediction in chinese text
    Hu, GP
    Chen, BF
    Fan, M
    Wang, RH
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 407 - 412
  • [23] Phrase-based Clause Extraction for Open Information Extraction System
    Romadhony, Ade
    Widyantoro, Dwi H.
    Purwarianti, Ayu
    2015 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2015, : 155 - 162
  • [25] A Quantitative Study on Information Contribution of Prosody Phrase Boundaries in Chinese Speech
    Zhang, Jinsong
    Li, Wei
    Xie, Yanlu
    Cao, Wen
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II, 2012, : 555 - 558
  • [26] The improvements of text rank for domain-specific key phrase extraction
    Wang Z.
    Feng Y.
    Li F.
    Wang, Zhijuan, 2016, UK Simulation Society, Clifton Lane, Nottingham, NG11 8NS, United Kingdom (17):
  • [27] A Feature Extraction Method Using Base Phrase and keyword In Chinese Text
    Li, Xin-fu
    Zhao, Lei-lei
    Wu, Li-hong
    2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 680 - +
  • [28] Automatic Open Domain Information Extraction from Indonesian Text
    Gultom, Yohanes
    Wibowo, Wahyu Catur
    2017 INTERNATIONAL WORKSHOP ON BIG DATA AND INFORMATION SECURITY (IWBIS 2017), 2017, : 23 - 30
  • [29] A hybrid system for temporal information extraction from clinical text
    Tang, Buzhou
    Wu, Yonghui
    Jiang, Min
    Chen, Yukun
    Denny, Joshua C.
    Xu, Hua
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2013, 20 (05) : 828 - 835
  • [30] Information extraction from free-text business documents
    Abramowicz, W
    Piskorski, J
    ISSUES AND TRENDS OF INFORMATION TECHNOLOGY MANAGEMENT IN CONTEMPORARY ORGANIZATIONS, VOLS 1 AND 2, 2002, : 626 - 630