The feature extraction of text mining based on Web

Authors
Liu, LZ [1 ]
Chen, JJ [1 ]
Song, HT [1 ]
Affiliation
[1] Beijing Inst Technol, Dept Comp, Beijing 100081, Peoples R China
Keywords
Web; text mining; extract
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A large amount of the information on the WWW is embedded in Web documents, so mining the content of Web pages is an important application. Text mining depends on identifying the feature set of the documents. Text features are extracted from term frequencies under the Vector Space Model, and a feature subset is selected according to an evaluation function in order to reduce the high dimensionality of the vectors. The choice of features determines the efficiency of text mining.
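The pipeline described in the abstract (term-frequency vectors under the Vector Space Model, followed by feature-subset selection with an evaluation function) can be illustrated with a minimal Python sketch. The abstract does not say which evaluation function the authors use, so document frequency is assumed here purely as a stand-in; the function names `tokenize`, `term_frequencies`, `document_frequency`, `select_features`, and `vectorize`, the sample documents, and the cutoff `k` are all hypothetical and not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on non-letters (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Return one Counter of raw term counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def document_frequency(tfs):
    """Assumed evaluation function: number of documents containing each term."""
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())
    return df


def select_features(tfs, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = document_frequency(tfs)
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


def vectorize(tfs, features):
    """Project each document onto the selected features (reduced VSM vectors)."""
    return [[tf[term] for term in features] for tf in tfs]


# Hypothetical sample documents, for illustration only.
docs = [
    "web text mining extracts features from web pages",
    "feature selection reduces vector dimensions",
    "text mining depends on text features",
]

tfs = term_frequencies(docs)
features = select_features(tfs, k=3)
vectors = vectorize(tfs, features)

print(features)
print(vectors)
```

Each document ends up as a vector with only `k` components instead of one component per distinct term in the collection, which is the dimensionality reduction the abstract refers to. Swapping in a different evaluation function (for example information gain or a chi-square statistic) would only require replacing `document_frequency`.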
Pages: 547-550
Number of pages: 4