Creating streaming iterative soft clustering algorithms

Cited by: 7
Authors
Hore, Prodip [1]
Hall, Lawrence O. [1]
Goldgof, Dmitry B. [1]
Affiliation
[1] Univ S Florida, Dept Comp Sci & Engn, ENB118,4202 E Fowler Ave, Tampa, FL 33620 USA
Funding
National Institutes of Health (USA);
Keywords
fuzzy; possibilistic; clustering; streaming; scalable;
DOI
10.1109/NAFIPS.2007.383888
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An increasing number of large labeled and unlabeled data sets are available. Clustering algorithms are the best suited for helping one make sense of unlabeled data. However, scaling iterative clustering algorithms to large amounts of data has been a challenge. The computation time can be very long, and for data sets that will not fit in even the largest memory, only carefully chosen subsets of the data can be practically clustered. We present a general approach that enables iterative fuzzy/possibilistic clustering algorithms to be turned into algorithms that can handle arbitrary amounts of streaming data. The computation time is also reduced for very large data sets, while the clustering results will be very similar to those from clustering all of the data, if that were possible. We introduce transformed equations for fuzzy c-means, possibilistic c-means, and the Gustafson-Kessel algorithm, and show excellent performance with a streaming fuzzy c-means implementation. The resulting clusters are sensible and, for comparable data sets (those that fit in memory), almost identical to those obtained with the original clustering algorithm.
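This record does not reproduce the paper's transformed equations, so the following is only a minimal sketch of one plausible reading of the abstract, not the authors' implementation. It assumes that each chunk of the stream is clustered with a weighted fuzzy c-means, that the resulting centroids are carried into the next chunk as weighted points, and that each centroid's weight is the weighted sum of its memberships. The names `memberships`, `weighted_fcm`, and `streaming_fcm`, the fuzzifier default `m=2.0`, and the naive "first c points" initialization are all assumptions made for illustration.

```python
def memberships(x, centers, m):
    """Fuzzy c-means memberships of point x in each cluster (they sum to 1)."""
    # Squared distances, clamped so a point sitting on a center is safe.
    d2 = [max(sum((a - b) ** 2 for a, b in zip(x, v)), 1e-12) for v in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def weighted_fcm(points, weights, c, m=2.0, init=None, iters=100, tol=1e-6):
    """Weighted fuzzy c-means; returns (centers, cluster_weights)."""
    dim = len(points[0])
    # Assumption: naive initialization from the first c points unless given.
    centers = [list(v) for v in (init if init is not None else points[:c])]
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(c)]
        den = [0.0] * c
        for x, w in zip(points, weights):
            u = memberships(x, centers, m)
            for i in range(c):
                g = w * u[i] ** m  # weighted, fuzzified membership
                den[i] += g
                for k in range(dim):
                    num[i][k] += g * x[k]
        new = [[num[i][k] / den[i] for k in range(dim)] for i in range(c)]
        shift = max(abs(a - b) for v, n in zip(centers, new) for a, b in zip(v, n))
        centers = new
        if shift < tol:
            break
    # Assumption: a centroid's weight is the weighted sum of its memberships.
    cluster_weights = [0.0] * c
    for x, w in zip(points, weights):
        u = memberships(x, centers, m)
        for i in range(c):
            cluster_weights[i] += w * u[i]
    return centers, cluster_weights


def streaming_fcm(chunks, c, m=2.0):
    """Cluster a stream chunk by chunk, keeping only c weighted centroids."""
    centers, cweights = None, None
    for chunk in chunks:
        pts = list(chunk)
        wts = [1.0] * len(pts)  # every fresh point counts once
        if centers is not None:
            # Carry the previous summary forward as weighted points.
            pts = centers + pts
            wts = cweights + wts
        centers, cweights = weighted_fcm(pts, wts, c, m, init=centers)
    return centers, cweights
```

Under these assumptions only one chunk plus `c` weighted centroids is held in memory at a time, which is what would let the procedure handle a stream of arbitrary length. Calling `streaming_fcm(chunks, c=2)` with any iterable of lists of equal-length tuples returns the final centroids and their accumulated weights.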
Pages: 484 / +
Page count: 2