Fast Semisupervised Learning With Bipartite Graph for Large-Scale Data

Citations: 26
Authors
He, Fang [1, 2]
Nie, Feiping [2, 3]
Wang, Rong [2]
Li, Xuelong [2, 3]
Jia, Weimin [4]
Affiliations
[1] Xian Res Inst Hitech, Xian 710025, Peoples R China
[2] Northwestern Polytech Univ, Ctr Opt IMagery Anal & Learning OPTIMAL, Xian 710072, Peoples R China
[3] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
[4] Xian Res Inst Hitech, Dept Commun Engn, Xian 710025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bipartite graph; large-scale data; out-of-sample; semisupervised learning (SSL); SEMI-SUPERVISED CLASSIFICATION; VECTOR MACHINES; SEARCH;
DOI
10.1109/TNNLS.2019.2908504
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because labeled information in the real world is scarce and labeling samples is time-consuming and expensive, semisupervised learning (SSL) has important applications in computer vision and machine learning. Among SSL approaches, graph-based SSL (GSSL) models have recently attracted much attention for their high accuracy. However, for most traditional GSSL methods, large-scale data bring higher computational complexity, which requires a more powerful computing platform. To address these issues, we propose a novel approach, the bipartite GSSL normalized (BGSSL-normalized) method. The method consists of three parts. First, a bipartite graph between the original data and the anchor points is constructed; this graph is parameter-insensitive, scale-invariant, naturally sparse, and simple to construct. Second, the labels of the original data and of the anchors are inferred through the graph. Third, we extend the algorithm to handle out-of-sample data at large scale by using the inferred labels of the anchors, which not only retains good classification results but also saves a large amount of time. The computational complexity of BGSSL-normalized is reduced to $O(ndm + nm^2)$, a significant improvement over traditional GSSL methods, which need $O(n^2 d + n^3)$, where $n$, $d$, and $m$ are the numbers of samples, features, and anchors, respectively. Experimental results on several publicly available data sets demonstrate that our approaches achieve better classification accuracy at lower time cost.
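The three parts named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' BGSSL-normalized algorithm, whose details the abstract does not give. Two things are assumptions: the parameter-free k-nearest-anchor weighting, borrowed from the adaptive-neighbors literature, and the label-inference step, which swaps in an anchor-graph-regularization-style least-squares objective. The function names `bipartite_graph`, `fit_anchor_labels`, and `predict`, and the parameters `k` and `gamma`, are hypothetical.

```python
import numpy as np

def bipartite_graph(X, anchors, k=5):
    """Row-stochastic n x m matrix linking each sample to its k nearest anchors.

    Assumed weighting (not confirmed by the abstract):
    z_ij = (d_{i,k+1} - d_ij) / (k * d_{i,k+1} - sum_{j<=k} d_ij),
    where d are squared distances sorted per row. Requires m >= k + 1.
    """
    # Squared Euclidean distances between samples and anchors: O(n d m).
    d2 = ((X ** 2).sum(1)[:, None] - 2.0 * X @ anchors.T
          + (anchors ** 2).sum(1)[None, :])
    d2 = np.maximum(d2, 0.0)                      # clip round-off negatives
    idx = np.argsort(d2, axis=1)[:, :k + 1]       # k+1 nearest anchors per row
    rows = np.arange(X.shape[0])[:, None]
    dk = d2[rows, idx]                            # sorted distances, n x (k+1)
    num = dk[:, k:k + 1] - dk[:, :k]
    den = k * dk[:, k] - dk[:, :k].sum(1)
    # Fall back to uniform weights when all k+1 distances tie.
    w = np.where(den[:, None] > 1e-12,
                 num / np.maximum(den, 1e-12)[:, None], 1.0 / k)
    Z = np.zeros_like(d2)
    Z[rows, idx[:, :k]] = w                       # only k nonzeros per row
    return Z

def fit_anchor_labels(Z, y, labeled, n_classes, gamma=0.1):
    """Infer soft anchor labels A (m x c) from a few labeled samples.

    Swapped-in objective (an assumption, in the style of anchor graph
    regularization): ||Z_l A - Y_l||^2 + gamma * tr(A^T L A),
    with the reduced Laplacian L = Z^T Z - Z^T Z diag(Z^T 1)^{-1} Z^T Z.
    """
    m = Z.shape[1]
    Y = np.eye(n_classes)[y[labeled]]             # one-hot labels
    Zl = Z[labeled]
    ZtZ = Z.T @ Z                                 # O(n m^2)
    deg = np.maximum(Z.sum(0), 1e-12)             # anchor degrees
    L = ZtZ - (ZtZ / deg[None, :]) @ ZtZ
    # Small ridge term keeps the m x m system well posed.
    M = Zl.T @ Zl + gamma * L + 1e-6 * np.eye(m)
    return np.linalg.solve(M, Zl.T @ Y)

def predict(X_new, anchors, A, k=5):
    """Out-of-sample step: connect new points to anchors, reuse anchor labels."""
    return (bipartite_graph(X_new, anchors, k) @ A).argmax(1)
```

In this sketch the dominant costs are the distance computation, $O(ndm)$, and the product `Z.T @ Z`, $O(nm^2)$, which is consistent with the complexity quoted in the abstract. Labels for the original samples are read off as `(Z @ A).argmax(1)`, and unseen samples only need their distances to the `m` anchors.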
Pages: 626-638
Number of pages: 13