Using Machine Learning for Web Page Classification in Search Engine Optimization

Cited by: 19
Authors
Matosevic, Goran [1 ]
Dobsa, Jasminka [2 ]
Mladenic, Dunja [3 ]
Affiliations
[1] Univ Pula, Fac Econ & Tourism Dr Mijo Mirkovic, Pula 52100, Croatia
[2] Univ Zagreb, Fac Org & Informat Varazdin, Zagreb 10000, Croatia
[3] Inst Jozef Stefan Ljubljana, Ljubljana 1000, Slovenia
Source
FUTURE INTERNET | 2021 / Vol. 13 / Issue 01
Keywords
search engine optimization; SEO optimization; on-page optimization; classification; machine learning; RANK;
DOI
10.3390/fi13010009
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper presents a novel approach of using machine learning algorithms based on experts' knowledge to classify web pages into three predefined classes according to the degree of content adjustment to the search engine optimization (SEO) recommendations. In this study, classifiers were built and trained to classify an unknown sample (web page) into one of the three predefined classes and to identify important factors that affect the degree of page adjustment. The data in the training set are manually labeled by domain experts. The experimental results show that machine learning can be used for predicting the degree of adjustment of web pages to the SEO recommendations: classifier accuracy ranges from 54.59% to 69.67%, which is higher than the baseline accuracy of classification of samples in the majority class (48.83%). Practical significance of the proposed approach is in providing the core for building software agents and expert systems to automatically detect web pages, or parts of web pages, that need improvement to comply with the SEO guidelines and, therefore, potentially gain higher rankings by search engines. Also, the results of this study contribute to the field of detecting optimal values of ranking factors that search engines use to rank web pages. Experiments in this paper suggest that important factors to be taken into consideration when preparing a web page are page title, meta description, H1 tag (heading), and body text, which is aligned with the findings of previous research. Another result of this research is a new data set of manually labeled web pages that can be used in further research.
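The comparison against the majority-class baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the class names (`low`, `medium`, `high`), the helper functions, and the labels are hypothetical and synthetic, chosen only to show how a baseline that always predicts the most frequent class is computed and compared with a classifier's accuracy.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic, illustrative labels (NOT data from the paper).
y_true = ["low", "medium", "medium", "high", "medium", "low"]
y_pred = ["low", "medium", "high", "high", "medium", "medium"]

baseline = majority_baseline_accuracy(y_true)
acc = accuracy(y_true, y_pred)
print(f"baseline={baseline:.2%} classifier={acc:.2%}")
```

A classifier is only informative in this sense when its accuracy exceeds the baseline, which is the comparison the abstract reports (54.59% to 69.67% against 48.83%).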
Pages: 1 / 20
Number of pages: 20