A similarity-based soft clustering algorithm for documents

Cited by: 0

Authors:

Lin, KI [1]

Kondadadi, R [1]

Affiliations:

[1] Memphis State Univ, Dept Math Sci, Memphis, TN 38152 USA

Source:

SEVENTH INTERNATIONAL CONFERENCE ON DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS | 2001

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Document clustering is an important tool for applications such as Web search engines. Clustering documents gives users a good overall view of the information contained in their documents. However, existing algorithms have shortcomings: hard clustering algorithms (where each document belongs to exactly one cluster) cannot detect the multiple themes of a document, while soft clustering algorithms (where each document can belong to multiple clusters) are usually inefficient. We propose SISC (SImilarity-based Soft Clustering), an efficient soft clustering algorithm based on a given similarity measure. SISC requires only a similarity measure for clustering and uses randomization to make the clustering efficient. Comparison with existing hard clustering algorithms such as K-means and its variants shows that SISC is both effective and efficient.


Pages: 40-47

Page count: 2
