A proposed multi criteria indexing and ranking model for documents and web pages on large scale data

Cited by: 3
Authors
Attia, Mohamed [1]
Abdel-Fattah, Manal A. [2]
Khedr, Ayman E. [1]
Affiliations
[1] Future Univ Egypt, Cairo, Egypt
[2] Helwan Univ, Helwan, Egypt
Keywords
Page ranking; User preferences; Crawling; Multi-criteria index; Retrieval process; Search engine; Search engine optimization; User
DOI
10.1016/j.jksuci.2021.10.009
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Due to the expansion of data, search engines face various obstacles in retrieving content that is relevant to users' search queries. Consequently, various retrieval and ranking algorithms have been applied to improve the relevance of results to users' needs. Unfortunately, indexing and ranking processes still face several challenges in achieving highly accurate results, since most existing indexes and ranking algorithms crawl documents and web pages based on a limited number of criteria that satisfy user needs. This research therefore studies how search engines work and which factors contribute to higher ranking results. The research also proposes a Multi Criteria Indexing and Ranking Model (MCIR) based on weighted documents and pages that depend on one or more ranking factors, aiming to build a model that achieves high performance, returns more relevant pages, and can index and rank both online and offline pages and documents. The MCIR model was applied in three different experiments to compare document and page results in terms of ranking scores, based on one or more criteria of users' preferences. The results of applying the MCIR model showed that final page rankings based on multiple criteria are better than those based on only one criterion, and that some criteria have a greater effect on ranking results than others. It was also observed that the MCIR model achieved high performance when indexing and ranking datasets of up to 100 gigabytes. (c) 2021 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
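The abstract describes ranking pages by weighting one or more criteria. The sketch below illustrates that general idea as a plain weighted sum. It is not the authors' MCIR implementation, which this record does not specify. The function name `rank_pages`, the criterion names (`keyword_match`, `freshness`, `inbound_links`), the page identifiers, and all scores and weights are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rank_pages(
    pages: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank pages by a normalized weighted sum of per-criterion scores.

    `pages` maps a page id to its criterion scores (assumed to lie in [0, 1]).
    `weights` maps a criterion name to a non-negative user preference weight.
    A criterion missing from a page counts as 0.0.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one criterion needs a positive weight")

    scored = []
    for page_id, criteria in pages.items():
        score = sum(w * criteria.get(name, 0.0) for name, w in weights.items())
        scored.append((page_id, score / total))

    # Highest score first; ties are broken by page id for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical data: two pages scored on three made-up criteria.
pages = {
    "page_a": {"keyword_match": 0.9, "freshness": 0.2, "inbound_links": 0.1},
    "page_b": {"keyword_match": 0.6, "freshness": 0.9, "inbound_links": 0.8},
}

single = rank_pages(pages, {"keyword_match": 1.0})
multi = rank_pages(
    pages,
    {"keyword_match": 0.5, "freshness": 0.25, "inbound_links": 0.25},
)
```

With this made-up data the order changes between the two calls: `page_a` leads when only `keyword_match` is used, while `page_b` leads once the other two criteria are weighted in. This only illustrates why a multi-criteria ranking can differ from a single-criterion one; it says nothing about the experimental results reported in the paper.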
Pages: 8702-8715
Number of pages: 14
Related papers
50 in total
  • [1] Voting model for ranking Web pages
    Lifantsev, M
    [J]. IC'2000: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTERNET COMPUTING, 2000, : 143 - 148
  • [2] Indexing and querying segmented web pages: the BlockWeb Model
    Bruno, Emmanuel
    Faessel, Nicolas
    Glotin, Herve
    Le Maitre, Jacques
    Scholl, Michel
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2011, 14 (5-6): : 623 - 649
  • [3] Indexing and querying segmented web pages: the BlockWeb Model
    Emmanuel Bruno
    Nicolas Faessel
    Hervé Glotin
    Jacques Le Maitre
    Michel Scholl
    [J]. World Wide Web, 2011, 14 : 623 - 649
  • [4] High Throughput Indexing for Large-scale Semantic Web Data
    Cheng, Long
    Kotoulas, Spyros
    Ward, Tomas E.
    Theodoropoulos, Georgios
    [J]. 30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 416 - 422
  • [5] Effective Model And Implementation Of Dynamic Ranking In Web Pages
    Divjot
    Singh, Jaiteg
    [J]. 2015 FIFTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT2015), 2015, : 1010 - 1014
  • [6] Page Ranking Algorithms in Web Mining, Limitations of Existing methods and a New Method for Indexing Web Pages
    Jain, Ashish
    Sharma, Rajeev
    Dixit, Gireesh
    Tomar, Varsha
    [J]. 2013 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT 2013), 2013, : 640 - 645
  • [7] Toward Large Scale Data-Aware Search: Ranking, Indexing, Resolution and Beyond
    Cheng, Tao
    Chang, Kevin Chen-Chuan
    [J]. 2010 IEEE 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDE 2010), 2010, : 297 - 300
  • [8] A Data Indexing Technique to Improve the Search Latency of AND Queries for Large Scale Textual Documents
    Mohideen, Abdulla Kalandar
    Majumdar, Shikharesh
    St-Hilaire, Marc
    El-Haraki, Ali
    [J]. 2020 IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES (BDCAT 2020), 2020, : 37 - 46
  • [9] A novel web ranking algorithm based on pages multi-attribute
    Baker M.R.
    Akcayol M.A.
    [J]. International Journal of Information Technology, 2022, 14 (2) : 739 - 749
  • [10] LEARNING TO CLASSIFY TROPICAL DISEASE WEB PAGES FROM LARGE INDONESIAN WEB DOCUMENTS
    Abidin, Taufik Fuadi
    Ferdhiana, Ridha
    Kamil, Hajjul
    [J]. FOURTH INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING (ICCEE 2011), 2011, : 347 - +