Implementing GloVe for Context Based k-means++ Clustering

Authors
Gupta, Akanksha [1]
Tripathy, B. K. [1]
Affiliations
[1] VIT Univ, Sch Comp Sci & Engn, Vellore, Tamil Nadu, India
Keywords
k-means++; NLP; GloVe; t-SNE; Word Embedding
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we implement a unique form of clustering that takes a non-numeric data set and clusters it with the help of the word embeddings provided by the GloVe dataset. For each item in the data set to be clustered, the corresponding word embedding is taken from the GloVe vector representation of that word. We then perform dimensionality reduction on the data set to obtain an appropriate number of dimensions for cluster formation. The data is then clustered using k-means++. This paper presents one way to overcome the limitation of k-means clustering in initialising the cluster centres, and hence obtains better-quality clusters. On synthetic examples, the k-means method does not perform well, because random seeding inevitably merges clusters together and the algorithm is then unable to split them apart. The careful seeding method used by k-means++ prevents this problem and hence usually gives optimal results, even on synthetic data sets.
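The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors in `TOY_LINES` are invented stand-ins for real GloVe embeddings, the dimensionality reduction step is omitted, and the function names (`parse_glove_lines`, `kmeanspp_seeds`, `kmeans`) are hypothetical. The only parts taken as given are the GloVe plain-text layout (a word followed by its vector components on each line) and the k-means++ seeding rule, where each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen.

```python
import random

# Invented 2-d vectors in GloVe's text layout (word, then components).
# Real GloVe vectors have 50 to 300 dimensions; these are assumptions for illustration.
TOY_LINES = [
    "apple 0.9 0.1", "banana 1.0 0.2", "cherry 0.8 0.0",
    "car -0.9 1.0", "truck -1.0 0.9", "bus -0.8 1.1",
]

def parse_glove_lines(lines):
    """Map each word to its vector, given lines of 'word v1 v2 ... vd'."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: first centre uniform, later ones weighted by D(x)^2."""
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(d2) == 0:  # every point already coincides with a centre
            centres.append(list(rng.choice(points)))
        else:
            centres.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centres

def kmeans(points, k, seed=0, iters=50):
    """Lloyd's iterations started from k-means++ seeds."""
    rng = random.Random(seed)
    centres = kmeanspp_seeds(points, k, rng)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

vectors = parse_glove_lines(TOY_LINES)
words = sorted(vectors)
labels, _ = kmeans([vectors[w] for w in words], k=2, seed=0)
clusters = dict(zip(words, labels))
print(clusters)
```

On this toy input the fruit words and the vehicle words end up in separate clusters. With real data, the dictionary would be built from a downloaded GloVe file, and a reduction step such as t-SNE (listed in the keywords) would precede clustering.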
Pages: 1041-1046
Page count: 6