A Markov Prediction Model Based on Page Hierarchical Clustering

Cited by: 3
Authors
Yao, Yao [1, 2]
Shi, Lei [1, 2]
Wang, Zhanhong [3]
Affiliations
[1] Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Henan, Peoples R China
[2] Henan Prov Key Lab Informat Network, Zhengzhou, Peoples R China
[3] Xinyang Normal Univ, Dept Comp Sci, Xinyang, Peoples R China
Keywords
Markov prediction model; Website link hierarchy structure (WLH); Website conceptual link hierarchy (WLC); Link similarity; Clustering
DOI
10.1080/15501320802575062
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The Markov prediction model is the basis of Web prefetching and personalized recommendation. It can also be used to extract the implicit link hierarchy of a website. A visualized site structure not only helps users understand the relationships between the pages they have visited, but also suggests where they can go next. However, the large number of Web objects leads to data redundancy and an oversized model. Mining and improving the link structure of a website has therefore become a central problem, and solving it benefits prefetching. This paper presents an improved method that simplifies the topology of a website and extracts a conceptual link hierarchy that makes the site organization clear and legible. First, a Markov tree is constructed, because it is a more capable mechanism for representing past activity in a form usable for prediction. In this case the Markov chain model can be defined as a three-tuple (A, S, P), where A is the set of operations, S is the state space consisting of all the states in a link structure, and P is the one-step transition probability matrix. The transition probability matrix is calculated from the Markov tree. Second, an algorithm is given to extract the hierarchical tree from this matrix, which yields the website link hierarchy (WLH). A WLH contains only trunk links, where a trunk link is a hyperlink from a page on a higher conceptual level to a page on the adjacent lower conceptual level. As the number of levels increases, each level contains more and more pages, which may blur the structure of the website. To tackle this problem, a clustering algorithm is proposed that clusters conceptually related pages on the same level based on their in-link and out-link similarities, which are measured by weighted Euclidean distance. After the pages in the WLH have been clustered, the website conceptual link hierarchy (WLC) can be constructed. Finally, the simplified model is used for Web page prediction.
Three measures, namely precision, recall, and PRS, are used to evaluate performance in the experiments. Experiments on two real Web log data sets demonstrate the efficiency of the proposed method, which not only achieves good overall performance and clustering quality but also maintains relatively high prediction precision and recall.
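As an illustration of two ingredients named in the abstract, the following minimal Python sketch estimates a one-step transition probability matrix P from click sessions and computes a weighted Euclidean distance between link vectors. It is not the authors' algorithm: the abstract derives P from a Markov tree and does not specify the weights, so the first-order counting scheme, the function names `transition_matrix`, `predict_next`, and `weighted_euclidean`, and the sample sessions are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def transition_matrix(sessions):
    """Estimate one-step transition probabilities from click sessions.

    Assumption: plain first-order counts over consecutive page pairs,
    not the Markov-tree construction described in the abstract.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    matrix = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        matrix[src] = {dst: n / total for dst, n in dsts.items()}
    return matrix


def predict_next(matrix, page, k=1):
    """Return up to k most probable next pages for the given page."""
    row = matrix.get(page, {})
    # Sort by descending probability, then by name for a stable order.
    ranked = sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))
    return [dst for dst, _ in ranked[:k]]


def weighted_euclidean(u, v, w):
    """Weighted Euclidean distance between two equal-length link vectors.

    Assumption: u and v are hypothetical in-link or out-link indicator
    vectors and w holds one non-negative weight per dimension.
    """
    return sqrt(sum(wi * (ui - vi) ** 2 for ui, vi, wi in zip(u, v, w)))


# Hypothetical sessions, invented purely for this example.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "shop"],
    ["home", "news", "sports"],
]
P = transition_matrix(sessions)
print(P["home"])
print(predict_next(P, "news"))
```

Each row of `P` sums to 1, so `predict_next` simply ranks the candidate successors of a page by estimated probability. A smaller `weighted_euclidean` value would mean two pages have more similar link patterns and are better candidates for the same cluster.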
Pages: 89 / 89
Page count: 1