Information Extraction from Thai Text with Unknown Phrase Boundaries

Authors
Intarapaiboon, Peerasak [1]
Nantajeewarawat, Ekawit [1]
Theeramunkong, Thanaruk [1]
Affiliations
[1] Thammasat Univ, Sch Informat Comp & Commun Technol, Sirindhorn Int Inst Technol, Bangkok, Thailand
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS | 2009, Vol. 5476
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Using sliding-window rule application and extraction filtering techniques, we propose a framework for extracting semantic frames from Thai textual phrases with unknown boundaries based on patterns of triggering terms. A supervised rule learning algorithm is used for constructing multi-slot extraction rules from hand-tagged training phrases. A filtering module is introduced for predicting rule application across phrase boundaries based on instantiation features of rule-internal wildcards. The framework is applied to text documents in three domains with different target-phrase densities and average lengths. The experimental results show that the filtering module improves precision while preserving high recall, yielding extraction performance comparable to frame extraction with manually identified phrase boundaries.
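The sliding-window idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the rule format (a list of triggering terms with `*` wildcards between them), the function names `match_rule` and `extract`, and the English placeholder tokens are all hypothetical. In particular, the paper's filtering module is a predictor built on wildcard instantiation features; here it is replaced by a simple assumed threshold (`max_gap`) on how many tokens each wildcard absorbs.

```python
from typing import List, Optional, Tuple

WILDCARD = "*"  # hypothetical marker for a rule-internal wildcard


def match_rule(rule: List[str], window: List[str]) -> Optional[List[int]]:
    """Match `rule` against `window`, anchored at window[0].

    Assumes the rule starts and ends with a triggering term and has no
    two adjacent wildcards. Returns the number of tokens absorbed by
    each wildcard, or None if the rule does not match.
    """
    gaps: List[int] = []
    pos = 0
    for i, item in enumerate(rule):
        if item == WILDCARD:
            # Skip ahead to the nearest occurrence of the next trigger term.
            try:
                nxt = window.index(rule[i + 1], pos)
            except ValueError:
                return None
            gaps.append(nxt - pos)
            pos = nxt
        else:
            if pos >= len(window) or window[pos] != item:
                return None
            pos += 1
    return gaps


def extract(tokens: List[str], rule: List[str],
            window_size: int, max_gap: int) -> List[Tuple[int, List[int]]]:
    """Slide a fixed-size window over an unsegmented token stream.

    A match is kept only if every wildcard absorbed at most `max_gap`
    tokens; this threshold is a stand-in for a learned filter.
    """
    accepted = []
    for start in range(len(tokens)):
        gaps = match_rule(rule, tokens[start:start + window_size])
        if gaps is not None and all(g <= max_gap for g in gaps):
            accepted.append((start, gaps))
    return accepted


# Placeholder tokens; a real system would operate on segmented Thai words.
tokens = ["a", "fever", "x", "for", "y", "z", "days",
          "b", "fever", "for", "p", "q", "r", "s", "days"]
rule = ["fever", WILDCARD, "for", WILDCARD, "days"]
print(extract(tokens, rule, window_size=7, max_gap=2))
```

In this toy run the rule matches at two positions, but the second match needs a wildcard to span four tokens, so the threshold rejects it as a likely match across a phrase boundary. That mirrors, in a much cruder form, the precision-oriented role the abstract assigns to the filtering module.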
Pages: 525-532
Number of pages: 8
Related papers
50 in total
  • [1] Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
    Intarapaiboon, Peerasak
    Nantajeewarawat, Ekawit
    Theeramunkong, Thanaruk
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, E94D (03): : 465 - 478
  • [2] An Application of Intuitionistic Fuzzy Sets to Improve Information Extraction from Thai Unstructured Text
    Intarapaiboon, Peerasak
    Theeramunkong, Thanaruk
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (09): : 2334 - 2345
  • [3] The State of Knowledge Extraction from Text for Thai Language
    Netisopakul, Ponrudee
    Wohlgenannt, Gerhard
    2017 6TH IIAI INTERNATIONAL CONGRESS ON ADVANCED APPLIED INFORMATICS (IIAI-AAI), 2017, : 379 - 384
  • [4] Speech-to-Text Summarization Using Automatic Phrase Extraction from Recognized Text
    Rott, Michal
    Cerva, Petr
    TEXT, SPEECH, AND DIALOGUE, 2016, 9924 : 101 - 108
  • [5] Information extraction from biomedical text
    Hobbs, JR
    JOURNAL OF BIOMEDICAL INFORMATICS, 2002, 35 (04) : 260 - 264
  • [6] Extracting Thai compound nouns for paragraph extraction in Thai text
    Suwanno, N
    Suzuki, Y
    Yamazaki, H
    PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 657 - 662
  • [7] A Context Free Gramma for Key Noun-Phrase Extraction from Text
    Liu, Ying
    PROCEEDINGS OF THE 52ND ANNUAL HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, 2019, : 1174 - 1183
  • [8] Traffic Information Extraction and Classification from Thai Twitter
    Klaithin, Supon
    Haruechaiyasak, Choochart
    2016 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2016, : 82 - 87
  • [9] Towards Idea Mining: Problem-Solution Phrase Extraction from Text
    Liu, Haixia
    Brailsford, Tim
    Goulding, James
    Maul, Tomas
    Tan, Tao
    Chaudhuri, Debanjan
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II, 2022, 13726 : 3 - 14
  • [10] From Text to XML by Structural Information Extraction
    Piao, Yong
    Wang, Tianyu
    Jiang, He
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2015, : 448 - 452