Multitask Coupled Logistic Regression and Its Fast Implementation for Large Multitask Datasets

Cited by: 18
Authors
Gu, Xin [1]
Chung, Fu-Lai [2]
Ishibuchi, Hisao [3]
Wang, Shitong [1]
Affiliations
[1] Jiangnan Univ, Sch Digital Media, Wuxi 214122, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[3] Osaka Prefecture Univ, Dept Comp Sci, Osaka 5998531, Japan
Funding
Japan Society for the Promotion of Science; National Natural Science Foundation of China
Keywords
Dual coordinate descent method (CDdual); logistic regression (LR); multitask classification learning (MTC); posterior probability; MULTIPLE TASKS;
DOI
10.1109/TCYB.2014.2362771
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
When facing multitask-learning problems, it is desirable that the learning method can find the correct input-output features, share the commonality among multiple domains, and scale up to large multitask datasets. We introduce a multitask coupled logistic regression (LR) framework, the LR-based multitask classification learning algorithm (MTC-LR), a new method that generates a classifier for each task while sharing the commonality among the task domains. The basic idea of MTC-LR is to use individual LR-based classifiers, one for each task domain. In contrast to support vector machine (SVM)-based proposals, MTC-LR learns the parameter vectors of all individual classifiers globally with the conjugate gradient method, without the kernel trick, and extends easily to a scaled version. We theoretically show that adding a new term to the cost function of the set of LRs, one that penalizes the diversity among multiple tasks, couples the tasks and allows MTC-LR to improve learning performance in an LR way. This finding makes it easy to integrate MTC-LR with a state-of-the-art fast LR algorithm, the dual coordinate descent method (CDdual), to develop a fast version, MTC-LR-CDdual, for large multitask datasets. The proposed MTC-LR-CDdual algorithm is also theoretically analyzed. Our experimental results on artificial and real datasets indicate the effectiveness of MTC-LR-CDdual in classification accuracy, speed, and robustness.
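The coupling idea in the abstract (one LR classifier per task, plus a cost term that penalizes diversity among tasks) can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the exact penalty, so the sketch assumes the coupling term is the squared distance of each task's weight vector from the mean weight vector, and it uses plain gradient descent in place of the conjugate gradient or CDdual solvers named above. The function names (`objective`, `gradient`, `fit`) and the parameters `lam`, `mu`, `step`, and `n_iter` are hypothetical.

```python
import numpy as np


def neg_sigmoid(m):
    """Numerically stable sigma(-m) = 1 / (1 + exp(m))."""
    return np.exp(-np.logaddexp(0.0, m))


def objective(W, tasks, lam, mu):
    """Coupled LR cost. W has shape (T, d); tasks is a list of (X_t, y_t), y in {-1, +1}.

    Assumed form (not taken from the paper):
      sum_t logistic_loss_t + lam/2 * sum_t ||w_t||^2 + mu/2 * sum_t ||w_t - w_bar||^2
    """
    w_bar = W.mean(axis=0)
    data_loss = sum(
        np.sum(np.logaddexp(0.0, -y * (X @ w))) for w, (X, y) in zip(W, tasks)
    )
    ridge = 0.5 * lam * np.sum(W ** 2)
    coupling = 0.5 * mu * np.sum((W - w_bar) ** 2)
    return data_loss + ridge + coupling


def gradient(W, tasks, lam, mu):
    """Gradient of `objective` with respect to W, shape (T, d)."""
    w_bar = W.mean(axis=0)
    # The deviations from the mean sum to zero, so the coupling gradient
    # for task t reduces to mu * (w_t - w_bar).
    G = lam * W + mu * (W - w_bar)
    for t, (X, y) in enumerate(tasks):
        margins = y * (X @ W[t])
        G[t] -= X.T @ (y * neg_sigmoid(margins))
    return G


def fit(tasks, lam=1.0, mu=1.0, step=0.01, n_iter=500):
    """Plain gradient descent on all task weight vectors jointly (a stand-in solver)."""
    d = tasks[0][0].shape[1]
    W = np.zeros((len(tasks), d))
    for _ in range(n_iter):
        W -= step * gradient(W, tasks, lam, mu)
    return W
```

Under these assumptions, setting `mu=0` decouples the tasks into independent regularized LR problems, while larger `mu` pulls the per-task weight vectors toward their mean. The fixed `step` is only safe for small, roughly standardized feature matrices; a line search or an off-the-shelf optimizer would be the more robust choice.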
Pages: 1953-1966
Number of pages: 14