Web Page Classification Using RNN

Cited by: 15
Authors
Buber, Ebubekir [1]
Diri, Banu [1]
Affiliation
[1] Yildiz Tech Univ, Comp Engn Dept, Istanbul, Turkey
Keywords
web page classification; classification; categorization; deep learning; RNN; transfer learning
DOI
10.1016/j.procs.2019.06.011
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Web page classification is an information retrieval task whose output can serve as a basis for many application domains. In this study, a deep learning-based system was developed for classifying web pages. A page is classified using the meta tag information it contains: the title, description, and keywords tags. An RNN-based deep learning architecture was used in the tests. Transfer learning is the approach of building a machine learning model from pre-trained parameters to solve a problem; its effect on the system was also examined. According to the results, the success rate of the web page classification system is approximately 85%. Transfer learning was not observed to contribute significantly to the success rate, but it did reduce the system resources consumed. (C) 2019 The Authors. Published by Elsevier Ltd.
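The abstract names three inputs (title, description, and keywords) but does not say how they are extracted. The following is a minimal sketch of one way to collect them with Python's standard library; the class name `MetaTagExtractor` and the function `extract_meta_text` are hypothetical names chosen for illustration and are not taken from the paper.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags.

    Hypothetical helper for illustration; not the authors' implementation.
    """

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Attribute values keep their case, so normalise the name here.
            name = (attr.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.fields[name] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so append each one.
        if self._in_title:
            self.fields["title"] += data


def extract_meta_text(html):
    """Return a dict with whitespace-normalised title, description, keywords."""
    parser = MetaTagExtractor()
    parser.feed(html)
    parser.close()
    return {key: " ".join(value.split()) for key, value in parser.fields.items()}
```

The resulting strings would still need to be tokenised and encoded before being passed to any sequence model; that step is not described in the abstract and is left out here.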
Pages: 62-72
Page count: 11