Mining Relevant Text Features for Retrieving Web Information

Cited by: 1
Authors
Pipanmekaporn, Luepol [1 ]
Kamolsantiroj, Suwatchai [1 ]
Affiliation
[1] King Mongkuts Univ Technol North Bangkok, Dept Comp & Informat Sci, Bangkok 10800, Thailand
Keywords
Feature Extraction; Feature Selection; Relevance Feedback and Text Mining;
DOI
10.1109/IIAI-AAI.2014.96
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
It is a big challenge to develop effective methods that can discover high-quality, useful features in text documents. Most existing information retrieval and text mining methods focus on term-based approaches, which often suffer from the problems of term variation and noise. This paper illustrates an innovative approach that discovers relevant knowledge to precisely describe text features for retrieving web information. In particular, it extracts precise text patterns by considering both relevant and irrelevant documents. Then, the discovered patterns are used to find accurate relevant features in a training set. The proposed approach has been evaluated through the implementation of a novel information filtering model, and a comparative evaluation is conducted against state-of-the-art models. The experimental results, obtained on the Reuters Corpus Volume 1 and TREC topics, show that the proposed approach significantly outperforms the best baseline method.
Pages: 447-452
Page count: 6