An Evolutionary-based Random Weight Networks with Taguchi Method for Arabic Web Pages Classification

Cited by: 3
Authors
Shawabkeh, Arwa [1 ]
Faris, Hossam [1 ]
Aljarah, Ibrahim [1 ]
Abu-Salih, Bilal [1 ]
Alboaneen, Dabiah [2 ]
Alhindawi, Nouh [3 ]
Affiliations
[1] Univ Jordan, King Abdullah II Sch Informat Technol, Amman, Jordan
[2] Imam Abdulrahman Bin Faisal Univ, Coll Sci & Humanities Jubail, Comp Sci Dept, POB 31961, Jubail Ind City, Saudi Arabia
[3] Jadara Univ, Fac Sci & Informat Technol, Amman, Jordan
Keywords
Binary particle swarm optimization; Taguchi method; Random weight network; Arabic documents; Web pages classification;
DOI
10.1007/s13369-020-05301-z
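Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code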
07 ; 0710 ; 09 ;
Abstract
The huge number of web documents now available on the Internet makes retrieving pages on a specific topic difficult, because irrelevant pages may be retrieved as well. Automatic classification of web documents and pages has essential applications in domains such as medicine, health, science, and information technology. Many web page classification methods have been proposed to improve search capabilities, especially for the English language. These methods attempt to classify English web pages while also reducing the high dimensionality of the features extracted from them. Because classification methods for other languages are lacking, this paper focuses on Arabic web page classification, motivated by the scarcity of such work and the importance of the Arabic language. In particular, we propose an evolutionary model based on binary particle swarm optimization (BPSO) combined with random weight networks (RWNs) as the induction algorithm, to reduce the high dimensionality of features in Arabic web pages and to classify documents automatically. The datasets used in this paper were collected from popular Arabic websites: three datasets covering three fields, namely Computer Science, Science, and Health. Further, the Taguchi method is incorporated to locate the best parameters of the proposed algorithm. The experimental results showed that the proposed model gives better performance for Arabic web page classification. In addition, an analysis was conducted to identify the most important features learned by the proposed model as well as the most important tags. The results showed that the list tag obtained the highest percentage, which reflects its effectiveness in classifying Arabic web pages.
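The wrapper idea described in the abstract (BPSO searching over binary feature masks, with an RWN scoring each mask) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rwn_fit_predict`, `fitness`, `bpso_select`), the sigmoid hidden layer, the sigmoid transfer function for bit updates, the use of plain validation accuracy as fitness, and every parameter value are assumptions made for illustration, and the Taguchi parameter tuning step is omitted.

```python
import numpy as np

def rwn_fit_predict(X_train, y_train, X_test, n_hidden=20, seed=0):
    """Random weight network: hidden weights are random and fixed,
    only the output weights are solved by least squares (pseudo-inverse)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(X_train.shape[1], n_hidden))
    b = rng.uniform(-1, 1, size=n_hidden)
    hidden = lambda X: 1.0 / (1.0 + np.exp(-(X @ W + b)))
    T = np.eye(int(y_train.max()) + 1)[y_train]   # one-hot targets
    beta = np.linalg.pinv(hidden(X_train)) @ T    # output weights
    return np.argmax(hidden(X_test) @ beta, axis=1)

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Assumed fitness: validation accuracy of an RWN on the selected features."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                            # empty subset is worthless
        return 0.0
    pred = rwn_fit_predict(X_tr[:, mask], y_tr, X_va[:, mask])
    return float(np.mean(pred == y_va))

def bpso_select(X_tr, y_tr, X_va, y_va, n_particles=10, n_iter=20,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 feature mask; velocities are mapped
    to bit probabilities through a sigmoid transfer function."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, d))
    vel = rng.uniform(-1, 1, size=(n_particles, d))
    score = lambda p: fitness(p, X_tr, y_tr, X_va, y_va)
    pbest = pos.copy()
    pbest_fit = np.array([score(p) for p in pos])
    g = int(pbest_fit.argmax())
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6, 6)                 # keep the sigmoid away from 0/1
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(int)
        fit = np.array([score(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(pbest_fit.argmax())
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    return gbest.astype(bool), gbest_fit

if __name__ == "__main__":
    # Synthetic stand-in data; the Arabic web page datasets are not reproduced here.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(120, 8))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    mask, acc = bpso_select(X[:80], y[:80], X[80:], y[80:], n_iter=5)
    print("selected features:", np.flatnonzero(mask), "validation accuracy:", acc)
```

In a wrapper of this kind the classifier is retrained for every candidate mask, which is why a cheap-to-train model such as an RWN is a natural fit as the induction algorithm. A fitness that also penalizes the number of selected features is a common variant, but whether the paper uses one is not stated in the abstract.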
Pages: 3955-3980
Page count: 26
Related papers
50 in total
  • [1] An Evolutionary-based Random Weight Networks with Taguchi Method for Arabic Web Pages Classification
    Arwa Shawabkeh
    Hossam Faris
    Ibrahim Aljarah
    Bilal Abu-Salih
    Dabiah Alboaneen
    Nouh Alhindawi
    [J]. Arabian Journal for Science and Engineering, 2021, 46 : 3955 - 3980
  • [2] Evolutionary-based packets classification for anomaly detection in web layer
    Kozik, Rafal
    Choras, Michal
    Holubowicz, Witold
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2016, 9 (15) : 2901 - 2910
  • [3] A structure and evolutionary-based classification of solute carriers
    Ferrada, Evandro
    Superti-Furga, Giulio
    [J]. ISCIENCE, 2022, 25 (10)
  • [4] A voting method for the classification of web pages
    Fang, Rui
    Mikroyannidis, Alexander
    Theodoulidis, Babis
    [J]. 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WORKSHOPS PROCEEDINGS, 2006, : 610 - +
  • [5] A New Method to Weight Web Pages Based on Authority Changing
    Huang, Jian-cai
    [J]. PROCEEDINGS OF THE 2009 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING SYSTEMS, 2009, : 686 - 687
  • [6] Fuzzy Clustering Method for Web User Based on Pages Classification
    ZHAN Li-qiang
    [J]. Wuhan University Journal of Natural Sciences, 2004, (05) : 553 - 556
  • [7] Evolutionary-based selection of generalized instances for imbalanced classification
    Garcia, Salvador
    Derrac, Joaquin
    Triguero, Isaac
    Carmona, Cristobal J.
    Herrera, Francisco
    [J]. KNOWLEDGE-BASED SYSTEMS, 2012, 25 (01) : 3 - 12
  • [8] A Comparative Study of Web-pages Classification Methods using Fuzzy Operators Applied to Arabic Web-pages
    Al-Taani, Ahmad T.
    Al-Awad, Noor Aldeen K.
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 7, 2005, 7 : 33 - 35
  • [9] Optimal Evolutionary-Based Deployment of Mobile Sensor Networks
    Shafiabady, Niusha
    Nekoui, Mohammad Ali
    [J]. ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PROCEEDINGS, 2009, 5855 : 563 - +
  • [10] A comparative study of web-pages classification methods using fuzzy operators applied to arabic web-pages
    Al-Taani, AT
    Al-Awad, NAK
    [J]. ENFORMATIKA, VOL 7: IEC 2005 PROCEEDINGS, 2005, : 33 - 35