NMF based Dimension Reduction Methods for Turkish Text Clustering

Cited by: 0
Authors
Guran, Aysun [1]
Ganiz, Murat Can [1]
Naiboglu, Hamit Selahattin [1]
Kaptikacti, Halil Oguz [1]
Affiliation
[1] Dogus Univ, Dept Comp Engn, TR-34722 Istanbul, Turkey
Source
2013 IEEE INTERNATIONAL SYMPOSIUM ON INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS (IEEE INISTA) | 2013
Keywords
component; Turkish text clustering; k-means; dimension reduction; NMF; NMF based text summarization;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we analyze the effects of dimension reduction methods based on non-negative matrix factorization (NMF) on the clustering of Turkish documents using the k-means clustering algorithm. All experiments are conducted on two datasets, which we call Milliyet4c1k and 1150haber. The NMF-based dimension reduction methods serve two purposes: reducing the original vector space by transformation, and reducing size and dimension by summarizing the original documents. Experimental results show that the NMF transformation yields better clustering results on both datasets. Running k-means on summarized documents produces results almost identical to running k-means on the original documents. Although using summaries instead of full documents does not improve clustering quality, we show that it significantly reduces the size of the processed data and the execution time of the k-means clustering algorithm.
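The transformation route described in the abstract (factor the document-term matrix with NMF, then run k-means in the reduced space) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy documents are made up and do not come from Milliyet4c1k or 1150haber, the raw term-count weighting, the multiplicative-update NMF, and the plain Lloyd-style k-means are all assumptions, and the helper names `tf_matrix`, `nmf`, and `kmeans` are hypothetical.

```python
import numpy as np


def tf_matrix(docs):
    """Build a document-term count matrix from whitespace-tokenized text."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.lower().split():
            X[i, index[w]] += 1.0
    return X, vocab


def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Approximate X by W @ H with nonnegative factors (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: assign each point to its nearest center, then re-average."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = points[labels == c].mean(axis=0)
    return labels


# Toy corpus (assumption): three sports-like and three finance-like documents.
docs = [
    "futbol gol takim",
    "gol takim lig",
    "futbol lig gol",
    "borsa dolar faiz",
    "dolar faiz banka",
    "borsa banka dolar",
]

X, vocab = tf_matrix(docs)    # 6 documents x 8 terms
W, H = nmf(X, rank=2)         # each row of W is a 2-dimensional document vector
labels = kmeans(W, k=2)       # cluster in the reduced space instead of on X
print(labels)
```

Clustering the rows of `W` rather than the rows of `X` is what makes this a dimension reduction step: k-means then works with `rank` features per document instead of one feature per vocabulary term.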
Pages: 5