GANY: A Genetic Spectral-based Clustering Algorithm for Large Data Analysis

Cited by: 0
Authors
Menendez, Hector D. [1]
Camacho, David [1]
Affiliations
[1] Univ Autonoma Madrid, Dept Comp Sci, Madrid, Spain
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data analysis has recently become one of the fastest-growing fields. The large volumes of data now available make their analysis a real challenge. The most relevant techniques fall mainly into two sub-domains: classification and clustering. Although classification continues to grow and evolve, clustering is one of the most promising techniques for large data analysis, because classification needs human supervision, which makes the analysis more expensive. Clustering is a blind process that groups data by similarity. Currently, the most relevant methods are those based on manifold identification. The main idea behind these techniques is to group data according to the shape they define in the space. Several techniques based on spectral analysis address this problem. However, these techniques are not suitable for large data because they require a lot of memory to determine the groups. In addition, they suffer from convergence to local minima, a problem common in statistical methodologies. This work focuses on combining genetic algorithms with spectral-based methodologies to deal with the large data analysis problem. We combine the Nystrom method with the spectrum to generate an approximation of the problem that gives an accurate summary of the search space. A genetic algorithm is also used to reduce the local minimum convergence problem in the new search space. The performance of this methodology has been evaluated in terms of accuracy on both synthetic and real-world datasets extracted from the literature.
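To make the memory argument in the abstract concrete, the following is a minimal sketch of a generic Nystrom approximation of a spectral embedding. It is not the authors' GANY implementation: the function names `rbf_affinity` and `nystrom_embedding`, the Gaussian affinity, the uniform landmark sampling, and the parameters `sigma`, `n_landmarks`, and `n_components` are all assumptions made for illustration. The point it shows is that only an n x m block of the affinity matrix is stored, instead of the full n x n matrix.

```python
import numpy as np


def rbf_affinity(A, B, sigma=1.0):
    """Gaussian (RBF) affinity between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def nystrom_embedding(X, n_landmarks, n_components, sigma=1.0, seed=0):
    """Approximate the leading affinity eigenvectors from a landmark subset.

    Hypothetical helper for illustration only (assumed design, not GANY).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: landmarks are sampled uniformly without replacement.
    idx = rng.choice(n, size=n_landmarks, replace=False)

    C = rbf_affinity(X, X[idx], sigma)  # n x m block: all points vs landmarks
    W = C[idx]                          # m x m block: landmarks vs landmarks

    # Eigendecompose only the small m x m block.
    eigvals, eigvecs = np.linalg.eigh(W)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, U = eigvals[order], eigvecs[:, order]

    # Nystrom extension: carry the landmark eigenvectors over to all n points.
    embedding = C @ U / lam
    return embedding, idx
```

Under these assumptions, the rows of `embedding` could then be grouped by any clustering search, such as the genetic algorithm the abstract mentions; that step is not sketched here because the record gives no details of the encoding or fitness function.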
Pages: 640-647
Page count: 8