Clustering High-Dimensional Stock Data using Data Mining Approach

Cited by: 0
Authors
Indriyanti, Dhea [1]
Dhini, Arian [1]
Affiliations
[1] Univ Indonesia, Fac Engn, Dept Ind Engn, Depok, Indonesia
Keywords
stock; high-dimensional data; clustering; EM; PCA
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In recent years, the number of stock investors in Indonesia has increased rapidly, so analysis of stocks is needed to help investors with their investment plans. Clustering is useful for selecting appropriate stocks for investors. Unfortunately, stock prices vary over time, so selecting stocks for investment is not easy. In addition, stock price time series are high-dimensional data influenced by many factors. In this study, the high-dimensional data are obtained from the time frame of each factor. It is therefore important to use a technique suited to clustering high-dimensional data. This paper presents High Dimensional Data Clustering (HDDC), a model-based clustering method based on the Gaussian Mixture Model and fitted with the Expectation-Maximization (EM) algorithm. HDDC via the EM algorithm gives a more robust result and makes it possible to impose additional assumptions. Moreover, this paper combines the high-dimensional clustering technique HDDC via the EM algorithm with the most popular feature extraction technique, Principal Component Analysis (PCA). The paper compares HDDC alone against the combination of HDDC and PCA to determine which method gives better results in clustering high-dimensional time series data. The 155 data features are reduced to 7 principal components using PCA. Although PCA improves the time efficiency of building the model, HDDC via the EM algorithm handles the high-dimensional data better than the combination with PCA.
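The PCA-then-mixture-model pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it is not HDDC: HDDC fits class-specific subspaces, whereas this stand-in fits a plain diagonal-covariance Gaussian mixture with EM after a PCA projection. The data are synthetic (two shifted Gaussian groups), and the function names `pca_reduce` and `gmm_em`, the choice of two clusters, and all parameter values are assumptions made for illustration. Only the dimensions (155 features reduced to 7 components) come from the abstract.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def gmm_em(X, k, n_iter=50, seed=0):
    """Fit a diagonal-covariance Gaussian mixture with EM; return hard labels."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]      # init from data points
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))    # shape (k, d)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        diff = X[:, None, :] - means[None, :, :]         # shape (n, k, d)
        log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                          + np.sum(np.log(2 * np.pi * variances), axis=1))
        log_r = np.log(weights) + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
    return resp.argmax(axis=1)

# Synthetic stand-in for the stock data: 120 samples, 155 features.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, (60, 155)),
               data_rng.normal(3.0, 1.0, (60, 155))])

Z = pca_reduce(X, n_components=7)   # 155 features -> 7 principal components
labels = gmm_em(Z, k=2)             # cluster assignment per sample
```

Running the mixture model on `Z` instead of `X` is what shortens model building, since EM then works on 7 columns rather than 155. According to the abstract, that speed-up came at a cost on the authors' data, where HDDC applied directly to the high-dimensional data handled it better.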
Pages: 5