Kernel gradient descent algorithm for information theoretic learning

Cited by: 4
Authors
Hu, Ting [1 ]
Wu, Qiang [2 ]
Zhou, Ding-Xuan [3 ]
Affiliations
[1] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R China
[2] Middle Tennessee State Univ, Dept Math Sci, Murfreesboro, TN 37132 USA
[3] City Univ Hong Kong, Liu Bie Ju Ctr Math Sci, Sch Data Sci, Dept Math, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Information theoretic learning; Minimum error entropy; Kernel method; Gradient descent algorithm; Regularization; Error; Rates; Classification; Criterion
DOI
10.1016/j.jat.2020.105518
Chinese Library Classification
O1 [Mathematics]
Subject Classification
0701; 070101
Abstract
Information theoretic learning is a learning paradigm that uses concepts of entropies and divergences from information theory. A variety of signal processing and machine learning methods fall into this framework, and the minimum error entropy principle is a typical example. In this paper, we study a kernel version of minimum error entropy methods that can be used to find nonlinear structures in the data. We show that the kernel minimum error entropy method can be implemented by kernel-based gradient descent algorithms, with or without regularization. Convergence rates for both algorithms are deduced. Published by Elsevier Inc.
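The abstract does not spell out the update rule, so the following is a minimal NumPy sketch of what a kernel gradient descent step for the empirical minimum error entropy (MEE) risk can look like. It assumes a Gaussian Parzen window of width h for the error density, a Gaussian RKHS kernel of width sigma, and an optional penalty lam * ||f||_K^2; all function names, parameter values, and step sizes below are illustrative assumptions, not the paper's exact scheme or rates.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """RKHS kernel matrix: K[i, j] = exp(-||X[i] - Z[j]||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_mee_gd(X, y, T=300, eta=0.5, h=1.0, sigma=1.0, lam=0.0):
    """Gradient descent for a (regularized) empirical MEE risk in an RKHS.

    Minimizes R(f) = -(1 / (n^2 h)) * sum_{i,j} exp(-(e_i - e_j)^2 / (2 h^2))
                     + lam * ||f||_K^2
    over f = sum_i alpha_i K(x_i, .), where e_i = y_i - f(x_i).
    """
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.zeros(n)
    for _ in range(T):
        e = y - K @ alpha                      # residuals e_i
        D = e[:, None] - e[None, :]            # pairwise differences e_i - e_j
        W = np.exp(-D ** 2 / (2.0 * h ** 2))   # Parzen window weights
        # dR/de_i = (2 / (n^2 h^3)) * sum_j W_ij * (e_i - e_j)
        grad_e = (2.0 / (n ** 2 * h ** 3)) * (W * D).sum(axis=1)
        # chain rule with de_i/dalpha = -K[i, :]; the penalty adds 2 lam K alpha
        grad_alpha = -K @ grad_e + 2.0 * lam * (K @ alpha)
        alpha -= eta * grad_alpha
    # The MEE risk is shift-invariant, so fix the intercept via the mean residual.
    b = float(np.mean(y - K @ alpha))
    return alpha, b

# Illustrative usage on synthetic data (all values are arbitrary choices):
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(80, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(80)
alpha, b = kernel_mee_gd(X, y, h=0.5, sigma=0.3, lam=1e-3)
pred = gaussian_kernel(X, X, sigma=0.3) @ alpha + b
```

Because the MEE risk depends only on pairwise differences of the errors, the learned function is determined only up to an additive constant; recentering with the mean residual, as in the sketch, is a standard post-hoc adjustment.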
Pages: 22
Related Papers
50 records in total
  • [1] An Information Theoretic Kernel Algorithm for Robust Online Learning
    Fan, Haijin
    Song, Qing
    Xu, Zhao
    [J]. 2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012
  • [2] An information theoretic sparse kernel algorithm for online learning
    Fan, Haijin
    Song, Qing
    Xu, Zhao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (09) : 4349 - 4359
  • [3] Learning gradients by a gradient descent algorithm
    Dong, Xuemei
    Zhou, Ding-Xuan
    [J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2008, 341 (02) : 1018 - 1027
  • [4] A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime
    Lee, Da Eun
    Nakamura, Kensuke
    Tak, Jae-Ho
    Hong, Byung-Woo
    [J]. IEEE ACCESS, 2023, 11 : 85395 - 85404
  • [5] Learning rates of gradient descent algorithm for classification
    Dong, Xue-Mei
    Chen, Di-Rong
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2009, 224 (01) : 182 - 192
  • [6] Accelerated Proximal Gradient Descent in Metric Learning for Kernel Regression
    Gonzalez, Hector
    Morell, Carlos
    Ferri, Francesc J.
    [J]. PROGRESS IN ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION, IWAIPR 2018, 2018, 11047 : 219 - 227
  • [7] Towards a unification of information theoretic learning and kernel methods
    Jenssen, R
    Erdogmus, D
    Principe, JC
    Eltoft, T
    [J]. MACHINE LEARNING FOR SIGNAL PROCESSING XIV, 2004: 93 - 102
  • [8] Distributed kernel gradient descent algorithm for minimum error entropy principle
    Hu, Ting
    Wu, Qiang
    Zhou, Ding-Xuan
    [J]. APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2020, 49 (01) : 229 - 256
  • [9] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin
    Denil, Misha
    Colmenarejo, Sergio Gomez
    Hoffman, Matthew W.
    Pfau, David
    Schaul, Tom
    Shillingford, Brendan
    de Freitas, Nando
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [10] Network Gradient Descent Algorithm for Decentralized Federated Learning
    Wu, Shuyuan
    Huang, Danyang
    Wang, Hansheng
    [J]. JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2023, 41 (03) : 806 - 818