Convergence analysis of the batch gradient-based neuro-fuzzy learning algorithm with smoothing L1/2 regularization for the first-order Takagi-Sugeno system
Cited by: 21
Authors:
Liu, Yan [1,2]
Yang, Dakun [3]
Affiliations:
[1] Dalian Polytech Univ, Sch Informat Sci & Engn, Dalian 116034, Peoples R China
[2] Natl Engn Res Ctr Seafood, Dalian 116034, Peoples R China
[3] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
Funding:
National Natural Science Foundation of China;
China Postdoctoral Science Foundation;
Keywords:
Convergence;
First-order Takagi-Sugeno inference system;
Pi-Sigma network;
Smoothing L-1/2 regularization;
INFERENCE SYSTEM;
IDENTIFICATION;
NETWORK;
SCHEME;
DOI:
10.1016/j.fss.2016.07.003
Chinese Library Classification (CLC): TP301 [Theory, Methods];
Discipline code: 081202;
Abstract:
It has been proven that Takagi-Sugeno systems are universal approximators, and they are widely applied to classification and regression problems. The main challenges with these models are convergence analysis, the computational complexity caused by the large number of connections, and the pruning of unnecessary parameters. The neuro-fuzzy learning algorithm involves two tasks: generating a comparably sparse network and training the parameters. In addition, regularization methods have attracted increasing attention for network pruning; in particular, the L-q (0 < q < 1) regularizer, proposed after L-1 regularization, can obtain better solutions to sparsity problems. The L-1/2 regularizer has a particular sparsity capacity and is representative of the L-q (0 < q < 1) regularizers. However, the nonsmoothness of the L-1/2 regularizer may lead to oscillations during the learning process. In this study, we propose a gradient-based neuro-fuzzy learning algorithm with smoothing L-1/2 regularization for the first-order Takagi-Sugeno fuzzy inference system. The proposed approach has three advantages: (i) it improves on the original L-1/2 regularizer by eliminating the oscillation of the gradient of the cost function during training; (ii) it prunes inactive connections more effectively, removing more redundant connections than the original L-1/2 regularizer, while structure and parameter learning are carried out simultaneously; and (iii) it admits a theoretical convergence analysis, which is the explicit focus of this work. We also provide a series of simulations to demonstrate that the smoothing L-1/2 regularization can often obtain more compressive representations than the current L-1/2 regularization. (C) 2016 Elsevier B.V. All rights reserved.
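To make the regularization idea concrete, the following is a minimal Python sketch of a smoothing L-1/2 penalty of the kind described in the abstract: near zero, |w| is replaced by a smooth polynomial surrogate so that the square-root penalty stays differentiable and its gradient does not oscillate. The surrogate polynomial, the constants (a, lam, eta), and the function names (smooth_abs, l12_penalty) are illustrative assumptions, not taken from the paper, and for brevity the penalty is applied to a plain linear least-squares model rather than the paper's first-order Takagi-Sugeno (Pi-Sigma) system.

```python
import numpy as np

def smooth_abs(w, a=0.05):
    """Smooth surrogate for |w|: |w| outside [-a, a], a quartic polynomial inside.
    The polynomial 3a/8 + 3w^2/(4a) - w^4/(8a^3) matches |w| and its first
    derivative at w = +/- a and is strictly positive, so sqrt(.) stays smooth."""
    w = np.asarray(w, dtype=float)
    inner = 3*a/8 + 3*w**2/(4*a) - w**4/(8*a**3)
    return np.where(np.abs(w) >= a, np.abs(w), inner)

def smooth_abs_grad(w, a=0.05):
    """Derivative of the surrogate: sign(w) outside [-a, a], polynomial inside."""
    w = np.asarray(w, dtype=float)
    inner = 3*w/(2*a) - w**3/(2*a**3)
    return np.where(np.abs(w) >= a, np.sign(w), inner)

def l12_penalty(w, lam, a=0.05):
    """Smoothing L_{1/2} penalty: lam * sum_i f(w_i)^{1/2}."""
    return lam * np.sum(np.sqrt(smooth_abs(w, a)))

def l12_penalty_grad(w, lam, a=0.05):
    """Gradient of the smoothing penalty with respect to the weights."""
    f = smooth_abs(w, a)
    return lam * 0.5 * smooth_abs_grad(w, a) / np.sqrt(f)

# Toy batch-gradient update with the smoothing penalty on a linear model:
# minimize 0.5 * ||X w - y||^2 / N + l12_penalty(w).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10); w_true[:3] = [1.5, -2.0, 0.8]   # sparse ground truth
y = X @ w_true + 0.01 * rng.normal(size=200)

w = rng.normal(scale=0.1, size=10)
eta, lam = 0.05, 0.02
for _ in range(2000):
    grad = X.T @ (X @ w - y) / len(y) + l12_penalty_grad(w, lam)
    w -= eta * grad

print(np.round(w, 3))   # redundant weights are driven close to zero (pruned)
```

Because the surrogate equals 3a/8 > 0 at w = 0, sqrt(f(w)) and its derivative are well defined everywhere, which is what removes the gradient oscillation near zero that the nonsmooth |w|^{1/2} penalty causes; the exact smoothing function and constants used in the paper may differ from this sketch.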
Pages: 28-49
Number of pages: 22