Semantic Pattern Mining for Text Mining

Authors
Song, Xiaoli [1]
Wang, XiaoTong [2]
Hu, Xiaohua [1]
Affiliations
[1] Drexel Univ, Coll Comp & Informat, Philadelphia, PA 19104 USA
[2] San Jose State Univ, Coll Comp Engn, San Jose, CA 95192 USA
Keywords
ALGORITHM;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Pattern mining is a fundamental topic in data mining. Many pattern mining techniques, such as closed and maximal pattern mining, have been proposed for different applications. However, when calculating the frequency of a pattern, existing techniques treat every occurrence of a word equally. For example, although 'pie' in 'I love eating pie.' means something quite different from 'pie' in 'american pie', the occurrence in 'american pie' is still added to the count of 'pie' when its frequency is calculated. This paper aims to overcome that drawback and find valid patterns tailored to text mining. We approach pattern mining from a different perspective and introduce a novel problem, frequent semantic pattern mining. We then propose an algorithm that solves this problem via suffix array sorting and can be implemented to run in linear time. Compared with traditional pattern representations, our results show that the extracted semantic patterns are more than 13% more compact, and that a classifier built on these features is no less powerful, and can be more powerful.
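The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of counting frequent phrases with a suffix array over tokens; it is not the authors' method. It assumes that word senses have already been attached to tokens by some upstream step (the `#food` and `#title` suffixes are hypothetical labels), and it builds the suffix array by naive sorting rather than in linear time. The names `suffix_array` and `frequent_phrases` are illustrative.

```python
def suffix_array(tokens):
    # Naive construction: sort suffix start positions by the suffix itself.
    # (Assumption: simplicity over speed; this is not a linear-time build.)
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def frequent_phrases(tokens, min_support=2, max_len=2):
    """Count phrases of up to max_len tokens that occur at least min_support times."""
    sa = suffix_array(tokens)
    counts = {}
    for length in range(1, max_len + 1):
        prev, run = None, 0
        # In sorted order, suffixes sharing the same first `length` tokens
        # are adjacent, so each phrase forms one contiguous run.
        for pos in sa:
            phrase = tuple(tokens[pos:pos + length])
            if len(phrase) < length:
                continue  # suffix too short to contain a phrase of this length
            if phrase == prev:
                run += 1
            else:
                if prev is not None and run >= min_support:
                    counts[prev] = run
                prev, run = phrase, 1
        if prev is not None and run >= min_support:
            counts[prev] = run
    return counts


# Hypothetical sense-tagged input: 'pie' the food vs. 'pie' in a film title.
tokens = ("i love eating pie#food . american pie#title is a film . "
          "i love pie#food").split()
result = frequent_phrases(tokens, min_support=2, max_len=2)
```

With sense-tagged tokens, the occurrence of 'pie' inside 'american pie' no longer contributes to the count of the food sense, which is the kind of distinction the abstract motivates.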
Pages: 150-155
Number of pages: 6