NMF based Dimension Reduction Methods for Turkish Text Clustering

Authors
Guran, Aysun [1]
Ganiz, Murat Can [1]
Naiboglu, Hamit Selahattin [1]
Kaptikacti, Halil Oguz [1]
Affiliation
[1] Dogus Univ, Dept Comp Engn, TR-34722 Istanbul, Turkey
Source
2013 IEEE INTERNATIONAL SYMPOSIUM ON INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS (IEEE INISTA) | 2013
Keywords
component; Turkish text clustering; k-means; dimension reduction; NMF; NMF based text summarization
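DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405
Abstract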
In this work, we analyze the effects of NMF-based dimension reduction methods on the clustering of Turkish documents using the k-means clustering algorithm. All experiments are conducted on two datasets, which we call Milliyet4c1k and 1150haber. The NMF-based dimension reduction methods serve two purposes: reducing the original vector space by transformation, and reducing document size and dimensionality by summarizing the original documents. Experimental results show that the NMF transformation yields better clustering results on both datasets. Running k-means on summarized documents produces results almost identical to those of k-means on the original documents. Although using summaries instead of full documents does not improve clustering quality, we show that it significantly reduces the size of the processed data and the execution time of the k-means clustering algorithm.
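The transformation route described in the abstract (factor the document-term matrix with NMF, then run k-means on the reduced document vectors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' setup: the toy count matrix `docs`, the rank of 2, the Lee-Seung multiplicative updates for the Frobenius objective, the farthest-point k-means initialization, and the helper names `matmul`, `transpose`, `nmf`, `dist2`, and `kmeans`. The abstract does not say which NMF variant, term weighting, or initialization the paper uses.

```python
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate nonnegative v (docs x terms) as w @ h with
    multiplicative updates (an assumed variant, see lead-in)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy corpus: 6 documents x 6 terms, two topical blocks.
docs = [
    [3, 2, 2, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [2, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 2],
    [0, 0, 0, 2, 2, 3],
]

reduced, _ = nmf(docs, rank=2)   # each document becomes a 2-dimensional vector
labels = kmeans(reduced, k=2)    # cluster in the reduced space
print(labels)
```

On this toy matrix the first three documents should receive one label and the last three the other, since k-means operates on the two NMF coefficients per document rather than on the six original term counts. For a real corpus, a library implementation such as scikit-learn's `NMF` and `KMeans` would be the more practical choice.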
Pages: 5