A strong coreset algorithm to accelerate OPF as a graph-based machine learning in large-scale problems

Cited by: 1
Authors
Bostani, Hamid [1]
Sheikhan, Mansour [2]
Mahboobi, Behrad [3]
Affiliations
[1] Islamic Azad Univ, South Tehran Branch, Young Researchers & Elite Club, Tehran, Iran
[2] Islamic Azad Univ, Dept Elect Engn, South Tehran Branch, Tehran, Iran
[3] Islamic Azad Univ, Commun Comp & Ind Network Res Ctr, Dept Elect & Comp Engn, Sci & Res Branch, Tehran, Iran
Funding
National Science Foundation (USA);
Keywords
Coreset; Optimum-path forest; Large-scale problems; Massive datasets; OPTIMUM-PATH FOREST; INTRUSION DETECTION; CLASSIFICATION; HYBRID;
DOI
10.1016/j.ins.2020.10.009
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Optimum-path forest (OPF) is an efficient graph-based framework that determines the patterns in an input dataset by encoding the data as a graph and extracting the optimal partitions of that graph. Because OPF was introduced under simple assumptions that did not consider the requirements of large-scale problems, it is effective only for input datasets of moderate size. To provide a scalable OPF, this study introduces a strong coreset for accelerating the OPF algorithm. This approach expedites the OPF procedure, especially on massive datasets. To that end, a novel algebra is developed to represent OPF as an optimization problem suited to the proposed coreset definition. A novel coreset construction algorithm that approximates the OPF solutions is then proposed to improve the OPF construction speed. Simulation results from diverse experiments on various benchmark datasets show the computational gain of the proposed algorithm and its superiority over the original algorithm in construction and classification speed, while maintaining reliably accurate performance. The presented coreset construction algorithm performs the training and testing phases of OPF up to 6.1 and 4.9 times faster than the original, respectively. (C) 2020 Elsevier Inc. All rights reserved.
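For orientation, the baseline that the abstract sets out to accelerate can be sketched as follows. This is a minimal illustration of the standard supervised OPF classifier (prototypes taken from cross-class edges of a minimum spanning tree, followed by a competition under the maximum-arc path cost), assuming Euclidean distance and at least two classes. It is not the paper's coreset construction, and the names `train_opf` and `classify_opf` are hypothetical. Training visits every pair of samples, which is the quadratic cost that motivates working on a smaller coreset.

```python
import heapq
import math


def train_opf(X, y):
    """Sketch of supervised OPF training; returns (cost, label) per sample."""
    n = len(X)

    def dist(i, j):
        return math.dist(X[i], X[j])

    # Step 1: Prim's algorithm on the complete graph to obtain MST edges.
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        # Step 2: an MST edge joining two classes makes both ends prototypes.
        if parent[u] != -1 and y[parent[u]] != y[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v]:
                d = dist(u, v)
                if d < best[v]:
                    best[v], parent[v] = d, u

    # Step 3: Dijkstra-style competition; path cost = largest arc on the path.
    cost = [math.inf] * n
    label = list(y)
    for p in prototypes:
        cost[p] = 0.0
    heap = [(0.0, p) for p in prototypes]
    heapq.heapify(heap)
    done = [False] * n
    while heap:
        c, s = heapq.heappop(heap)
        if done[s] or c > cost[s]:
            continue  # stale queue entry
        done[s] = True
        for t in range(n):
            if not done[t]:
                new_cost = max(cost[s], dist(s, t))
                if new_cost < cost[t]:
                    # t is conquered by s and inherits its label.
                    cost[t], label[t] = new_cost, label[s]
                    heapq.heappush(heap, (new_cost, t))
    return cost, label


def classify_opf(X, cost, label, x):
    """Assign x the label of the training sample offering the cheapest path."""
    s = min(range(len(X)), key=lambda i: max(cost[i], math.dist(X[i], x)))
    return label[s]
```

Classification also scans every training sample per query, so both phases scale with the training-set size; a coreset-based variant would, in principle, run the same two routines on a reduced set of samples.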
Pages: 424-441
Page count: 18