Deep learning theory of distribution regression with CNNs

Cited: 0
Authors
Yu, Zhan [1 ]
Zhou, Ding-Xuan [2 ]
Affiliations
[1] Hong Kong Baptist Univ, Dept Math, Kowloon Tong, 224 Waterloo Rd, Hong Kong, Peoples R China
[2] Univ Sydney, Sch Math & Stat, Sydney, NSW 2006, Australia
Funding
US National Science Foundation;
Keywords
Learning theory; Deep learning; Distribution regression; Deep CNN; Oracle inequality; ReLU; Neural networks; Approximation; Bounds
DOI
10.1007/s10444-023-10054-y
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
We establish a deep learning theory for distribution regression with deep convolutional neural networks (DCNNs). Deep learning based on structured deep neural networks has been powerful in practical applications, and generalization analysis for regression with DCNNs has been carried out very recently. However, for the distribution regression problem, in which the input variables are probability measures, there has been no mathematical model or theoretical analysis of DCNN-based learning. One difficulty is that the classical neural network structure requires the input variable to be a Euclidean vector, so when the input samples are probability distributions, the traditional network structure cannot be used directly; a well-defined DCNN framework for distribution regression is therefore desirable. In this paper, we overcome this difficulty and establish a novel DCNN-based learning theory for a two-stage distribution regression model. First, we establish an approximation theory for functionals defined on the set of Borel probability measures within the proposed DCNN framework. We then show that the hypothesis space is well defined by rigorously proving its compactness. Furthermore, in the hypothesis space induced by the general DCNN framework with distribution inputs, using a two-stage error decomposition technique, we derive a novel DCNN-based two-stage oracle inequality and optimal learning rates (up to a logarithmic factor) for the proposed distribution regression algorithm.
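The two-stage sampling scheme described in the abstract — distributions drawn in a first stage, and only finite point samples from each distribution observed in a second stage — can be illustrated with a minimal numpy sketch. This is a toy stand-in, not the paper's algorithm: each second-stage sample is summarized as a fixed-length histogram, passed through a few random 1-D convolution filters with ReLU (mimicking a convolutional feature map), and fitted with a linear least-squares readout. All names, sizes, and parameter choices below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# First-stage sample: N input distributions (Gaussians with random means);
# the regression target is a functional of the distribution (here, sin of its mean).
N, n, bins = 200, 100, 32
means = rng.uniform(-2, 2, size=N)
targets = np.sin(means)

# Second-stage sample: only n points from each distribution are observed.
# Each point sample is summarized as a fixed-length histogram so that a
# convolutional layer can process it as a Euclidean vector.
grid = np.linspace(-5, 5, bins + 1)
X = np.stack([np.histogram(rng.normal(m, 1.0, n), bins=grid, density=True)[0]
              for m in means])

# Toy convolutional feature map: random 1-D filters + ReLU, then a linear
# readout fitted by least squares (a crude stand-in for training a DCNN).
filters = rng.normal(size=(8, 5))
feats = []
for f in filters:
    conv = np.apply_along_axis(lambda v: np.convolve(v, f, mode="valid"), 1, X)
    feats.append(np.maximum(conv, 0.0))  # ReLU activation
Phi = np.concatenate(feats, axis=1)          # (N, 8 * (bins - 5 + 1))
Phi = np.hstack([Phi, np.ones((N, 1))])      # append bias column

w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
mse = np.mean((Phi @ w - targets) ** 2)
print(f"train MSE: {mse:.4f}")
```

The point of the sketch is the data pipeline, not the learner: the second-stage samples, not the distributions themselves, are what the network ever sees, which is exactly why the paper's error analysis must decompose into two stages.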
Pages: 40
Related Papers
50 records
  • [1] Deep learning theory of distribution regression with CNNs
    Zhan Yu
    Ding-Xuan Zhou
    [J]. Advances in Computational Mathematics, 2023, 49
  • [3] Learning Theory for Distribution Regression
    Szabo, Zoltan
    Sriperumbudur, Bharath K.
    Poczos, Barnabas
    Gretton, Arthur
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [4] Deep distribution regression
    Li, Rui
    Reich, Brian J.
    Bondell, Howard D.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 159
  • [5] Learning deep CNNs for impulse noise removal in images
    Jin, Lianghai
    Zhang, Wenhua
    Ma, Guangzhi
    Song, Enmin
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 62 : 193 - 205
  • [6] Robust regression with deep CNNs for facial age estimation: An empirical study
    Dornaika, F.
    Bekhouche, S. E.
    Arganda-Carreras, I.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 141
  • [7] Improving Object Detection Accuracy with Region and Regression Based Deep CNNs
    Qu, Liang
    Wang, Shengke
    Yang, Na
    Chen, Long
    Liu, Lu
    Zhang, Xiaoyan
    Gao, Feng
    Dong, Junyu
    [J]. 2017 INTERNATIONAL CONFERENCE ON SECURITY, PATTERN ANALYSIS, AND CYBERNETICS (SPAC), 2017, : 318 - 323
  • [8] Learning Symmetry Consistent Deep CNNs for Face Completion
    Li, Xiaoming
    Hu, Guosheng
    Zhu, Jieru
    Zuo, Wangmeng
    Wang, Meng
    Zhang, Lei
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 7641 - 7655
  • [9] Theory for Deep Learning Regression Ensembles with Application to Raman Spectroscopy Analysis
    Li, Wenjing
    Paffenroth, Randy C.
    Timko, Michael T.
    Rando, Matthew P.
    Brown, Avery B.
    Deskins, N. Aaron
    [J]. 20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 1049 - 1056
  • [10] Transfer Learning with Deep CNNs for Gender Recognition and Age Estimation
    Smith, Philip
    Chen, Cuixian
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2564 - 2571