SRDA: An efficient algorithm for large-scale discriminant analysis

Cited by: 327
Authors
Cai, Deng
He, Xiaofei
Han, Jiawei
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Yahoo, Burbank, CA 91504 USA
Funding
National Science Foundation (USA);
Keywords
linear discriminant analysis; spectral regression; dimensionality reduction;
DOI
10.1109/TKDE.2007.190669
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Linear Discriminant Analysis (LDA) has been a popular method for extracting features that preserve class separability. The projection functions of LDA are commonly obtained by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. LDA has been widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, the computation of LDA involves the eigendecomposition of dense matrices, which can be computationally expensive in both time and memory. Specifically, LDA has O(mnt + t^3) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, it is infeasible to apply LDA. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework that facilitates both efficient computation and the use of regularization techniques. Specifically, SRDA only needs to solve a set of regularized least squares problems, and there is no eigenvector computation involved, which is a huge saving of both time and memory. Our theoretical analysis shows that SRDA can be computed with O(ms) time and O(ms) memory, where s (<= n) is the average number of nonzero features in each sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
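The regression view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: the response vectors are built by orthogonalizing class indicator vectors against the all-ones vector, the data are centered, and each regularized least squares problem is solved with a dense linear solve. That dense solve is chosen for brevity and does not achieve the O(ms) cost quoted above, which would need an iterative sparse solver. The names `srda_fit`, `srda_transform`, and `alpha` are hypothetical.

```python
import numpy as np

def srda_fit(X, y, alpha=1.0):
    """Sketch of a regression-based discriminant fit (assumed details).

    X: (m, n) array of m samples with n features.
    y: length-m array of class labels.
    alpha: ridge regularization strength (hypothetical parameter name).
    Returns (W, mean), where W has shape (n, c - 1) for c classes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    m, n = X.shape
    classes = np.unique(y)

    # One 0/1 indicator column per class.
    indicators = (y[:, None] == classes[None, :]).astype(float)

    # Assumption: responses are the class indicators orthogonalized
    # against the all-ones vector. QR performs that orthogonalization.
    # The last indicator is dropped because it is linearly dependent
    # on the ones vector and the remaining indicators.
    basis = np.column_stack([np.ones(m), indicators[:, :-1]])
    Q, _ = np.linalg.qr(basis)
    responses = Q[:, 1:]  # c - 1 columns, each orthogonal to the ones vector

    # Center the data, then solve one ridge regression per response.
    # No eigendecomposition is involved.
    mean = X.mean(axis=0)
    Xc = X - mean
    gram = Xc.T @ Xc + alpha * np.eye(n)
    W = np.linalg.solve(gram, Xc.T @ responses)
    return W, mean

def srda_transform(X, W, mean):
    """Project samples onto the learned discriminant directions."""
    return (np.asarray(X, dtype=float) - mean) @ W

# Usage on synthetic two-class data (made up for illustration).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.1, (20, 3)),
                    rng.normal(5.0, 0.1, (20, 3))])
y_demo = np.array([0] * 20 + [1] * 20)
W_demo, mean_demo = srda_fit(X_demo, y_demo, alpha=0.01)
Z_demo = srda_transform(X_demo, W_demo, mean_demo)
```

For c classes the sketch yields c - 1 projection directions, so the two-class example above maps each sample to a single coordinate.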
Pages: 1 - 12
Number of pages: 12
Related papers
50 in total
  • [21] URoad: An Efficient Algorithm for Large-scale Dynamic Ridesharing Service
    Fan, Jing
    Xu, Jinting
    Hou, Chenyu
    Cao, Bin
    Dong, Tianyang
    Cheng, Shiwei
    2018 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES (IEEE ICWS 2018), 2018, : 9 - 16
  • [22] An efficient algorithm for large-scale quasi-supervised learning
    Karacali, Bilge
    PATTERN ANALYSIS AND APPLICATIONS, 2016, 19 (02) : 311 - 323
  • [23] An efficient algorithm for large-scale quasi-supervised learning
    Bilge Karaçalı
    Pattern Analysis and Applications, 2016, 19 : 311 - 323
  • [24] An Efficient Module Detection Algorithm for Large-Scale Complex Networks
    Sun, Chuangchuang
    Dai, Ran
    2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018, : 4153 - 4158
  • [25] An Efficient Differential Grouping Algorithm for Large-Scale Global Optimization
    Kumar, Abhishek
    Das, Swagatam
    Mallipeddi, Rammohan
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2024, 28 (01) : 32 - 46
  • [26] Efficient large-scale data analysis using mapreduce
    Kubo, R., 1600, Nippon Telegraph and Telephone Corp. (10):
  • [27] Efficient bioinformatics approaches for large-scale data analysis
    Hautaniemi, S.
    FEBS JOURNAL, 2011, 278 : 27 - 27
  • [28] Linear Discriminant Analysis for Large-Scale data : Application on Text and Image data
    Elhadji Ille Gado, Nassara
    Grall-Maes, Edith
    Kharouf, Malika
    2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), 2016, : 961 - 964
  • [29] Wavelet analysis and processing algorithm for large-scale image
    Zhang, Jianhua
    Fu, Qianming
    Ma, Yan
    Lu, Chunxia
    DCABES 2006 Proceedings, Vols 1 and 2, 2006, : 325 - 328
  • [30] Large-scale maximum margin discriminant analysis using core vector machines
    Tsang, Ivor Wai-Hung
    Kocsor, Andras
    Kwok, James Tin-Yau
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (04) : 610 - 624