An Efficient Distributed Database Clustering Algorithm for Big Data Processing

Cited by: 0
Authors
Sun, Qiao [1]
Fu, Lan-mei [1]
Deng, Bu-qiao [1]
Pei, Xu-bin [2]
Sun, Jia-song [3]
Affiliations
[1] Beijing GuoDianTong Network Technol Co Ltd, Beijing, Peoples R China
[2] State Grid Zhejiang Elect Power Co Ltd, Hangzhou, Zhejiang, Peoples R China
[3] Tsinghua Univ, EE Dept, Beijing, Peoples R China
Keywords
Distributed big data processing; Distributed database; Data clustering; Depth neural network; K-means
DOI
10.23977/iccsc.2017.1012
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
This paper proposes a distributed data clustering technique based on a deep neural network. First, each record in the distributed database is treated as an input vector; its features are extracted and fed to the input layer of the deep neural network. The connection weights are then trained with the back-propagation (BP) algorithm, so that the network's output is trained by adjusting these weights. Finally, the clustering result for each record is determined by the similarity of the output vector corresponding to that record. Experiments on a small-scale distributed system show that the method achieves higher test-set accuracy than the traditional k-means clustering method and is better suited to large-scale data clustering in distributed environments.
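The abstract gives no implementation details, so the following is only a minimal single-machine sketch of the three steps it names (records as input vectors, weights trained by BP, cluster chosen by output-vector similarity). Everything concrete here is an assumption made for illustration and not the authors' method: the one-hidden-layer network, the labeled training records, the one-hot cluster prototypes, cosine similarity as the similarity measure, and the hypothetical names `train_bp`, `forward`, and `assign_clusters`. The distributed-database aspect is not modeled.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, params):
    """Forward pass: records (rows of X) -> hidden layer -> output vectors."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return H, sigmoid(H @ W2 + b2)


def train_bp(X, y, n_clusters, n_hidden=8, lr=0.5, epochs=1000, seed=0):
    """Train the connection weights with plain full-batch backpropagation.

    Assumption: labeled training records (y) are available and the loss is
    squared error against one-hot targets.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, n_clusters))
    b2 = np.zeros(n_clusters)
    T = np.eye(n_clusters)[y]  # one-hot target per record
    n = len(X)
    for _ in range(epochs):
        H, O = forward(X, (W1, b1, W2, b2))
        d_out = (O - T) * O * (1.0 - O)            # error through sigmoid
        d_hid = (d_out @ W2.T) * (1.0 - H ** 2)    # error through tanh
        W2 -= lr * (H.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / n
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1, W2, b2


def assign_clusters(outputs):
    """Pick, for each output vector, the most cosine-similar one-hot prototype."""
    prototypes = np.eye(outputs.shape[1])
    norms = np.linalg.norm(outputs, axis=1, keepdims=True)
    similarity = (outputs / np.maximum(norms, 1e-12)) @ prototypes.T
    return similarity.argmax(axis=1)


def make_demo_records(n_per_cluster=50, seed=1):
    """Synthetic stand-in for database records: two well-separated 2-D groups."""
    rng = np.random.default_rng(seed)
    a = rng.normal(-2.0, 0.5, (n_per_cluster, 2))
    b = rng.normal(2.0, 0.5, (n_per_cluster, 2))
    X = np.vstack([a, b])
    y = np.array([0] * n_per_cluster + [1] * n_per_cluster)
    return X, y
```

A typical use would be to call `train_bp(X, y, n_clusters=2)` on the records from `make_demo_records()`, run `forward` on new records, and pass the resulting output vectors to `assign_clusters`. With one-hot prototypes, the cosine-similarity rule reduces to taking the largest output component; a different prototype choice would change that.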
Pages: 70-74
Number of pages: 5