Sequence-based clustering for Web usage mining: A new experimental framework and ANN-enhanced K-means algorithm

Cited by: 35

Authors:

Park, Sungjune [1]

Suresh, Nallan C. [2]

Jeong, Bong-Keun [1]

Affiliations:

[1] Univ N Carolina, Business Informat Syst & Operat Management, Belk Coll Business, Charlotte, NC 28223 USA

[2] SUNY Buffalo, Dept Operat Management & Strategy, Sch Management, Buffalo, NY 14260 USA

Source:

DATA & KNOWLEDGE ENGINEERING | 2008 / Vol. 65 / Issue 3

Keywords:

Web usage mining; clustering methods; simulation; artificial intelligence; Markov chain

DOI:

10.1016/j.datak.2008.01.002

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We develop a general sequence-based clustering method by proposing new sequence representation schemes in association with Markov models. The resulting sequence representations allow for calculation of vector-based distances (dissimilarities) between Web user sessions and thus can be used as inputs of various clustering algorithms. We develop an evaluation framework in which the performances of the algorithms are compared in terms of whether the clusters (groups of Web users who follow the same Markov process) are correctly identified using a replicated clustering approach. A series of experiments is conducted to investigate whether clustering performance is affected by different sequence representations and different distance measures as well as by other factors such as number of actual Web user clusters, number of Web pages, similarity between clusters, minimum session length, number of user sessions, and number of clusters to form. A new, fuzzy ART-enhanced K-means algorithm is also developed and its superior performance is demonstrated. Published by Elsevier B.V.
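The abstract's general idea (turn each Web session into a vector derived from a Markov model, then feed those vectors to a distance-based clustering algorithm) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the representation is a flattened first-order transition-frequency matrix, the distance is Euclidean, the clustering is plain K-means rather than the fuzzy ART-enhanced variant, and the names `transition_vector`, `euclidean`, and `kmeans` as well as the toy sessions are hypothetical.

```python
import math
import random


def transition_vector(session, n_pages):
    """Flatten a session's first-order page-transition frequencies into a vector.

    Assumed representation: entry (src * n_pages + dst) holds the share of
    transitions in the session that go from page src to page dst.
    """
    counts = [0.0] * (n_pages * n_pages)
    for src, dst in zip(session, session[1:]):
        counts[src * n_pages + dst] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, n_iter=50, seed=0):
    """Plain K-means (not the paper's fuzzy ART-enhanced variant)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct sessions chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every session vector.
        new_labels = [
            min(range(k), key=lambda j: euclidean(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (hypothetical): two sessions alternate between pages 0 and 1,
# two alternate between pages 2 and 3.
sessions = [[0, 1, 0, 1, 0], [0, 1, 0, 1], [2, 3, 2, 3, 2], [2, 3, 2, 3]]
vectors = [transition_vector(s, n_pages=4) for s in sessions]
labels = kmeans(vectors, k=2)
print(labels)
```

On this toy input the first two sessions share one label and the last two share the other; which group receives label 0 depends on the random initialisation. Swapping in a different representation or distance measure only requires replacing `transition_vector` or `euclidean`, which mirrors the kind of comparison the abstract describes.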


Pages: 512-543

Number of pages: 32
