Information Extraction from Thai Text with Unknown Phrase Boundaries

Authors
Intarapaiboon, Peerasak [1]
Nantajeewarawat, Ekawit [1]
Theeramunkong, Thanaruk [1]
Affiliations
[1] Thammasat Univ, Sch Informat Comp & Commun Technol, Sirindhorn Int Inst Technol, Bangkok, Thailand
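Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS | 2009 / Vol. 5476
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract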
Using sliding-window rule application and extraction filtering techniques, we propose a framework for extracting semantic frames from Thai textual phrases with unknown boundaries based on patterns of triggering terms. A supervised rule learning algorithm is used for constructing multi-slot extraction rules from hand-tagged training phrases. A filtering module is introduced for predicting rule application across phrase boundaries based on instantiation features of rule-internal wildcards. The framework is applied to text documents in three domains with different target-phrase densities and average lengths. The experimental results show that the filtering module improves precision and preserves high recall satisfactorily, yielding extraction performance comparable to frame extraction with manually identified phrase boundaries.
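The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of sliding-window rule application followed by filtering, not the authors' algorithm. It assumes the text is already tokenized, that a rule is an ordered list of triggering terms separated by internal wildcards, and that the filter is a fixed threshold on wildcard length (the paper describes a learned predictor instead). The names `match_at`, `slide`, `keep`, `window`, and `max_gap` are hypothetical.

```python
def match_at(tokens, start, triggers, window):
    """Match the triggers in order inside tokens[start:start+window].

    The first trigger must sit at `start`. Returns the match with the
    tokens captured by each internal wildcard, or None if no match.
    """
    if tokens[start] != triggers[0]:
        return None
    end = min(len(tokens), start + window)
    pos = start
    gaps = []  # one token list per internal wildcard
    for trig in triggers[1:]:
        nxt = next((i for i in range(pos + 1, end) if tokens[i] == trig), None)
        if nxt is None:
            return None
        gaps.append(tokens[pos + 1:nxt])
        pos = nxt
    return {"start": start, "end": pos, "gaps": gaps}


def slide(tokens, triggers, window):
    """Apply one rule at every window position (phrase boundaries unknown)."""
    matches = []
    for start in range(len(tokens)):
        m = match_at(tokens, start, triggers, window)
        if m is not None:
            matches.append(m)
    return matches


def keep(match, max_gap):
    """Toy filter (assumption): reject matches whose wildcards capture too
    many tokens, a crude signal that the match crossed a phrase boundary."""
    return all(len(gap) <= max_gap for gap in match["gaps"])


# Placeholder tokens stand in for segmented Thai words.
tokens = ["A", "y", "y", "y", "A", "z", "B"]
matches = slide(tokens, ["A", "B"], window=7)
kept = [m for m in matches if keep(m, max_gap=2)]
```

Here the rule fires twice: once from position 0, where the wildcard swallows five tokens including a second "A", and once from position 4, where it captures only "z". The threshold filter keeps only the second match, which illustrates how filtering on wildcard instantiations can raise precision when rules are applied without known boundaries.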
Pages: 525-532
Number of pages: 8