AN APPROACH FOR TEXT CLUSTERING USING MODIFIED K-MEANS ALGORITHM

Cited by: 0
Authors
Rose, J. Dafni [1]
Mukherjee, Saswati [1]
Affiliation
[1] St Josephs Inst Technol, Dept Comp Sci & Engn, Madras, Tamil Nadu, India
Keywords
Text clustering; K-means algorithm; Apriori algorithm;
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835
Abstract
With the rapid expansion of the internet, the volume of available digital data is growing at a rapid pace day by day. This has led to a need for effective management of the available data. The text data stored in digital libraries are in an unstructured format. Thus, easy retrieval, accessibility, and organization of text material have become essential. Among the available text mining methodologies, clustering is one of the methods used for effective organization of data. In this paper, an efficient K-means algorithm for clustering text data is proposed. The algorithm includes a procedure for selecting the initial centroids: dissimilar documents are selected as the initial centroids for the K-means algorithm. The number of iterations needed to converge is shown to improve. The experimental results show that the proposed algorithm improves performance compared to the simple K-means algorithm.
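The abstract does not spell out how the dissimilar documents are chosen, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes bag-of-words term-frequency vectors, cosine distance, and a farthest-first heuristic for seeding; the function names (`tf_vector`, `pick_dissimilar_seeds`, `kmeans`) are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def pick_dissimilar_seeds(vectors, k):
    """Farthest-first seeding (an assumption, not the paper's procedure):
    start from document 0, then repeatedly add the document whose
    nearest already-chosen seed is farthest away."""
    seeds = [0]
    while len(seeds) < k:
        candidates = (i for i in range(len(vectors)) if i not in seeds)
        best = max(
            candidates,
            key=lambda i: min(cosine_distance(vectors[i], vectors[s]) for s in seeds),
        )
        seeds.append(best)
    return seeds


def mean_vector(members):
    """Component-wise mean of a non-empty list of term vectors."""
    total = Counter()
    for m in members:
        total.update(m)
    return Counter({t: c / len(members) for t, c in total.items()})


def kmeans(vectors, seeds, max_iter=100):
    """Plain K-means from the given seed documents.
    Returns (labels, number of iterations until labels stop changing)."""
    centroids = [Counter(vectors[s]) for s in seeds]
    labels = None
    for iteration in range(1, max_iter + 1):
        new_labels = [
            min(range(len(centroids)), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            return labels, iteration
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, max_iter
```

Calling `kmeans(vectors, pick_dissimilar_seeds(vectors, k))` and comparing the returned iteration count against a run seeded with arbitrary documents is one way to observe the kind of convergence difference the abstract reports.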
Pages: 243-247
Number of pages: 5
Related papers
50 in total
  • [1] Modified K-means clustering algorithm
    Li, Wei
    [J]. CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 4, PROCEEDINGS, 2008: 618 - 621
  • [2] Modified k-Means Clustering Algorithm
    Patel, Vaishali R.
    Mehta, Rupa G.
    [J]. COMPUTATIONAL INTELLIGENCE AND INFORMATION TECHNOLOGY, 2011, 250 : 307 - +
  • [3] Distributed Algorithm for Text Documents Clustering Based on k-Means Approach
    Sarnovsky, Martin
    Carnoka, Noema
    [J]. INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2015, PT II, 2016, 430 : 165 - 174
  • [4] DEM Fusion using a modified k-means clustering algorithm
    Fuss, Colleen E.
    Berg, Aaron A.
    Lindsay, John B.
    [J]. INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2016, 9 (12) : 1242 - 1255
  • [5] Inverted Index based Modified Version of K-Means Algorithm for Text Clustering
    Jo, Taeho
    [J]. JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2008, 4 (02) : 67 - 76
  • [6] A Modified K-means Algorithm for Sequence Clustering
    Hsu, Jia-Lien
    Yang, Hong-Xiang
    [J]. HIS 2009: 2009 NINTH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS, VOL 1, PROCEEDINGS, 2009: 287 - 292
  • [7] Modified K-Means Algorithm for Genetic Clustering
    Bonab, Mohammad Babrdel
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2011, 11 (09) : 24 - 28
  • [8] Modified moving k-means clustering algorithm
    Alias, Mohd Fauzi
    Isa, Nor Ashidi Mat
    Sulaiman, Siti Amrah
    Mohamed, Mahaneem
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2012, 16 (02) : 79 - 86
  • [9] Chinese text clustering algorithm based k-means
    Yao, Mingyu
    Pi, Dechang
    Cong, Xiangxiang
    [J]. 2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 301 - 307
  • [10] Weighted k-Means Algorithm Based Text Clustering
    Chen, Xiuguo
    Yin, Wensheng
    Tu, Pinghui
    Zhang, Hengxi
    [J]. IEEC 2009: FIRST INTERNATIONAL SYMPOSIUM ON INFORMATION ENGINEERING AND ELECTRONIC COMMERCE, PROCEEDINGS, 2009: 51 - +