Kernel gradient descent algorithm for information theoretic learning

Cited by: 4
Authors
Hu, Ting [1 ]
Wu, Qiang [2 ]
Zhou, Ding-Xuan [3 ]
Affiliations
[1] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R China
[2] Middle Tennessee State Univ, Dept Math Sci, Murfreesboro, TN 37132 USA
[3] City Univ Hong Kong, Liu Bie Ju Ctr Math Sci, Sch Data Sci, Dept Math, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Information theoretic learning; Minimum error entropy; Kernel method; Gradient descent algorithm; Regularization; Error; Rates; Classification; Criterion
DOI
10.1016/j.jat.2020.105518
Chinese Library Classification
O1 [Mathematics]
Subject Classification
0701; 070101
Abstract
Information theoretic learning is a learning paradigm that uses concepts of entropies and divergences from information theory. A variety of signal processing and machine learning methods fall into this framework, and the minimum error entropy principle is a typical example. In this paper, we study a kernel version of minimum error entropy methods that can be used to find nonlinear structures in the data. We show that the kernel minimum error entropy method can be implemented by kernel-based gradient descent algorithms, with or without regularization. Convergence rates for both algorithms are deduced. Published by Elsevier Inc.
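The abstract does not spell out the update rule, so the following is a minimal NumPy sketch of what a kernel gradient descent step for the empirical minimum error entropy (MEE) risk can look like. It assumes a Gaussian Parzen window of width h for the error density, a Gaussian RKHS kernel of width sigma, and an optional penalty lam * ||f||_K^2; all function names, parameter values, and step sizes below are illustrative assumptions, not the paper's exact scheme or rates.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """RKHS kernel matrix: K[i, j] = exp(-||X[i] - Z[j]||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_mee_gd(X, y, T=300, eta=0.5, h=1.0, sigma=1.0, lam=0.0):
    """Gradient descent for a (regularized) empirical MEE risk in an RKHS.

    Minimizes R(f) = -(1 / (n^2 h)) * sum_{i,j} exp(-(e_i - e_j)^2 / (2 h^2))
                     + lam * ||f||_K^2
    over f = sum_i alpha_i K(x_i, .), where e_i = y_i - f(x_i).
    """
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    alpha = np.zeros(n)
    for _ in range(T):
        e = y - K @ alpha                      # residuals e_i
        D = e[:, None] - e[None, :]            # pairwise differences e_i - e_j
        W = np.exp(-D ** 2 / (2.0 * h ** 2))   # Parzen window weights
        # dR/de_i = (2 / (n^2 h^3)) * sum_j W_ij * (e_i - e_j)
        grad_e = (2.0 / (n ** 2 * h ** 3)) * (W * D).sum(axis=1)
        # chain rule with de_i/dalpha = -K[i, :]; the penalty adds 2 lam K alpha
        grad_alpha = -K @ grad_e + 2.0 * lam * (K @ alpha)
        alpha -= eta * grad_alpha
    # The MEE risk is shift-invariant, so fix the intercept via the mean residual.
    b = float(np.mean(y - K @ alpha))
    return alpha, b

# Illustrative usage on synthetic data (all values are arbitrary choices):
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(80, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(80)
alpha, b = kernel_mee_gd(X, y, h=0.5, sigma=0.3, lam=1e-3)
pred = gaussian_kernel(X, X, sigma=0.3) @ alpha + b
```

Because the MEE risk depends only on pairwise differences of the errors, the learned function is determined only up to an additive constant; recentering with the mean residual, as in the sketch, is a standard post-hoc adjustment.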
Pages: 22
Related Papers
50 records in total
  • [1] An Information Theoretic Kernel Algorithm for Robust Online Learning
    Fan, Haijin
    Song, Qing
    Xu, Zhao
    [J]. 2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012
  • [2] An information theoretic sparse kernel algorithm for online learning
    Fan, Haijin
    Song, Qing
    Xu, Zhao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (09) : 4349 - 4359
  • [3] Learning gradients by a gradient descent algorithm
    Dong, Xuemei
    Zhou, Ding-Xuan
    [J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2008, 341 (02) : 1018 - 1027
  • [4] A Continual Learning Algorithm Based on Orthogonal Gradient Descent Beyond Neural Tangent Kernel Regime
    Lee, Da Eun
    Nakamura, Kensuke
    Tak, Jae-Ho
    Hong, Byung-Woo
    [J]. IEEE ACCESS, 2023, 11 : 85395 - 85404
  • [5] Learning rates of gradient descent algorithm for classification
    Dong, Xue-Mei
    Chen, Di-Rong
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2009, 224 (01) : 182 - 192
  • [6] Accelerated Proximal Gradient Descent in Metric Learning for Kernel Regression
    Gonzalez, Hector
    Morell, Carlos
    Ferri, Francesc J.
    [J]. PROGRESS IN ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION, IWAIPR 2018, 2018, 11047 : 219 - 227
  • [7] Towards a unification of information theoretic learning and kernel methods
    Jenssen, R
    Erdogmus, D
    Principe, JC
    Eltoft, T
    [J]. MACHINE LEARNING FOR SIGNAL PROCESSING XIV, 2004: 93 - 102
  • [8] Distributed kernel gradient descent algorithm for minimum error entropy principle
    Hu, Ting
    Wu, Qiang
    Zhou, Ding-Xuan
    [J]. APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2020, 49 (01) : 229 - 256
  • [9] Learning to learn by gradient descent by gradient descent
    Andrychowicz, Marcin
    Denil, Misha
    Colmenarejo, Sergio Gomez
    Hoffman, Matthew W.
    Pfau, David
    Schaul, Tom
    Shillingford, Brendan
    de Freitas, Nando
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [10] Network Gradient Descent Algorithm for Decentralized Federated Learning
    Wu, Shuyuan
    Huang, Danyang
    Wang, Hansheng
    [J]. JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2023, 41 (03) : 806 - 818