Hybrid Dimensionality Reduction Approach for Web Page Classification

Cited by: 0
Authors
Sarode, Shraddha [1]
Gadge, Jayant [2]
Affiliations
[1] Thadomal Shahani Engn Coll, Comp Engn ME, Bombay, Maharashtra, India
[2] Thadomal Shahani Engn Coll, Dept Comp Engn, Bombay, Maharashtra, India
Keywords
Dimensionality Reduction; Feature Selection; Information Gain; Naive Bayes; Rough Set; Web Page Classification
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract
Today there is a huge amount of data available on the World Wide Web. One way to manage this data is web page classification. One issue in web page classification, and the one considered in this paper, is high dimensionality. Dimensionality refers to the number of terms in a web page. High dimensionality causes problems when classifying web pages, so the main objective of reducing it is to improve the performance of the classifier. This paper describes a hybrid approach to dimensionality reduction for web page classification that combines information gain with a rough set method. Information gain is used for feature selection, and the rough set based Quick Reduct algorithm is used for dimensionality reduction. Web pages are then classified using the naive Bayes method. The proposed architecture was tested, and the authors report significant results.
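The pipeline named in the abstract (information gain ranking, then Quick Reduct, then naive Bayes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy pages, the cut-off of four terms, the binary term-presence representation, the Bernoulli form of naive Bayes with Laplace smoothing, and all function names.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting pages on presence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder

def dependency(docs, labels, terms):
    """Rough-set dependency: share of pages whose equivalence class
    (same presence pattern over `terms`) carries a single label."""
    classes = defaultdict(list)
    for d, y in zip(docs, labels):
        classes[tuple(t in d for t in terms)].append(y)
    consistent = sum(len(ys) for ys in classes.values() if len(set(ys)) == 1)
    return consistent / len(labels)

def quick_reduct(docs, labels, terms):
    """Greedy Quick Reduct: keep adding the term that raises dependency most
    until the subset reaches the dependency of the full term list."""
    target = dependency(docs, labels, terms)
    reduct = []
    while dependency(docs, labels, reduct) < target:
        best = max((t for t in terms if t not in reduct),
                   key=lambda t: dependency(docs, labels, reduct + [t]))
        reduct.append(best)
    return reduct

def train_naive_bayes(docs, labels, terms):
    """Bernoulli naive Bayes over term presence, with Laplace smoothing."""
    model = {}
    for c in set(labels):
        in_class = [d for d, y in zip(docs, labels) if y == c]
        probs = {t: (sum(t in d for d in in_class) + 1) / (len(in_class) + 2)
                 for t in terms}
        model[c] = (len(in_class) / len(docs), probs)
    return model

def classify(model, doc, terms):
    """Return the class with the highest log posterior for `doc`."""
    def log_score(c):
        prior, probs = model[c]
        return math.log(prior) + sum(
            math.log(probs[t] if t in doc else 1 - probs[t]) for t in terms)
    return max(model, key=log_score)

# Hypothetical toy corpus: each page is reduced to its set of terms.
docs = [{"goal", "team", "match"}, {"team", "score", "league"},
        {"stock", "market", "team"}, {"market", "price", "stock"},
        {"goal", "league"}, {"price", "bank"}]
labels = ["sports", "sports", "finance", "finance", "sports", "finance"]

vocab = sorted(set().union(*docs))
# Step 1: feature selection, keeping the 4 terms with the highest information gain.
selected = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                  reverse=True)[:4]
# Step 2: dimensionality reduction with Quick Reduct on the selected terms.
reduct = quick_reduct(docs, labels, selected)
# Step 3: classification with naive Bayes on the reduced term set.
model = train_naive_bayes(docs, labels, reduct)
prediction = classify(model, {"goal", "league", "match"}, reduct)
```

In this sketch a term such as "team", which appears in both classes, receives a low information gain and is dropped in step 1, while Quick Reduct in step 2 discards any remaining terms that are not needed to keep the pages distinguishable by class. On a real corpus the cut-off and the term representation would have to be chosen empirically.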
Pages: 6