HierarchicalRank: Webpage Rank Improvement Using HTML TagLevel Similarity

Authors
Sharma, Dilip [1]
Ganeshiya, Deepak [1]
Affiliations
[1] GLA Univ Mathura, Dept Comp Engn & Applicat, Mathura, Uttar Pradesh, India
Keywords
Web mining; web graph; hyperlink analysis; connectivity; PageRank; HTML tags
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Prior research has introduced two types of ranking algorithms: query-dependent algorithms, which work online, and query-independent algorithms, which work offline. The PageRank algorithm works offline, independent of the query, while the Hyperlink-Induced Topic Search (HITS) algorithm works online and depends on the query. One problem with these algorithms is that rank is divided according to the number of inlinks and outlinks and other hyperlink-analysis parameters, which may or may not depend on webpage content, and this leads to topic drift. Previous work tried to solve this problem using the popularity of the outlinked webpages. This paper proposes a novel algorithm for measuring popularity, based on the similarity between the query and hierarchical text extracted from the source and target webpages using a HyperText Markup Language (HTML) tag importance parameter. The results of the proposed method are compared with those of the PageRank algorithm and of Topic Distillation with Query Dependent Link Connections and Page Characteristics.
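The abstract does not give the formula for the popularity measure, so the following is only a minimal sketch of the general idea, under stated assumptions: text is collected per HTML tag, each term is weighted by a tag importance value, and a page's share of rank is made proportional to the cosine similarity between the query and that weighted text. The weights in TAG_WEIGHTS, the choice of cosine similarity, and the names TagTextCollector, tag_weighted_similarity, and split_rank are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical tag importance values; the paper's actual parameters are not given here.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "b": 2.0, "p": 1.0}


class TagTextCollector(HTMLParser):
    """Collects terms from a page, weighting each by its most important enclosing tag."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # currently open tags that carry a weight
        self.weighted_terms = Counter()  # term -> accumulated weight

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if not self.stack:
            return  # text outside any weighted tag is ignored in this sketch
        weight = max(TAG_WEIGHTS[t] for t in self.stack)
        for term in data.lower().split():
            self.weighted_terms[term] += weight


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_weighted_similarity(query, html):
    """Similarity between a query and the tag-weighted text of one page."""
    collector = TagTextCollector()
    collector.feed(html)
    collector.close()
    return cosine(Counter(query.lower().split()), collector.weighted_terms)


def split_rank(rank, query, target_pages):
    """Divide a source page's rank among its outlink targets by query similarity.

    Falls back to an equal split (as in plain PageRank) when no target matches.
    """
    sims = {url: tag_weighted_similarity(query, html)
            for url, html in target_pages.items()}
    total = sum(sims.values())
    if total == 0:
        return {url: rank / len(sims) for url in sims}
    return {url: rank * s / total for url, s in sims.items()}


pages = {
    "a": "<html><title>web mining</title><p>graphs</p></html>",
    "b": "<html><title>cooking</title><p>web recipes</p></html>",
}
shares = split_rank(1.0, "web mining", pages)
```

In this toy input, page "a" matches the query in its title and page "b" only in body text, so "a" receives the larger share. A real system would also need tokenization, stop-word handling, and a treatment of the source page's text, none of which are modeled here.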
Pages: 485-492
Page count: 8