Learning flat representations with artificial neural networks

Cited by: 0
Authors
Vlad Constantinescu
Costin Chiru
Tudor Boloni
Adina Florea
Robi Tacutu
Affiliations
[1] Institute of Biochemistry of the Romanian Academy,Systems Biology of Aging Group
[2] University Politehnica of Bucharest,Computer Science and Engineering Department
[3] AITIAOne Inc.
[4] Chronos Biosystems SRL
Source
Applied Intelligence | 2021, Vol. 51
Keywords
Learning representations; Infomax; Beta distribution; Vanishing gradients;
DOI: not available
Abstract
In this paper, we propose a method of learning representation layers with squashing activation functions within a deep artificial neural network which directly addresses the vanishing gradients problem. The proposed solution is derived from solving the maximum likelihood estimator for components of the posterior representation, which are approximately Beta-distributed, formulated in the context of variational inference. This approach not only improves the performance of deep neural networks with squashing activation functions on some of the hidden layers (including in discriminative learning) but can also be employed to produce sparse codes.
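The abstract describes fitting Beta distributions to the components of a squashing-activation representation. As a rough, hypothetical illustration only (not the authors' actual method, whose estimator is derived within variational inference), the sketch below fits Beta(α, β) parameters to sigmoid activations by the method of moments and evaluates a negative Beta log-likelihood that could act as a regularization term; all function names and the toy network are invented for this example:

```python
import math
import numpy as np

def beta_mom(col, eps=1e-6):
    """Method-of-moments fit of Beta(alpha, beta) to values in (0, 1)."""
    col = np.clip(col, eps, 1 - eps)
    m, v = col.mean(), col.var()
    c = m * (1 - m) / max(v, eps) - 1.0
    return max(m * c, eps), max((1 - m) * c, eps)

def beta_nll(col, a, b, eps=1e-6):
    """Mean negative Beta log-likelihood of activations; a term like this
    could pull a representation component toward a Beta(a, b) shape."""
    col = np.clip(col, eps, 1 - eps)
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_norm - np.mean((a - 1) * np.log(col) + (b - 1) * np.log(1 - col))

# Toy hidden layer: sigmoid ("squashing") activations of a random projection.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
W = rng.normal(scale=0.5, size=(8, 4))
h = 1.0 / (1.0 + np.exp(-(x @ W)))

for j in range(h.shape[1]):  # fit each representation component separately
    a, b = beta_mom(h[:, j])
    print(f"unit {j}: alpha={a:.2f}, beta={b:.2f}, nll={beta_nll(h[:, j], a, b):.3f}")
```

Because sigmoid outputs lie in (0, 1), the Beta family is a natural per-component model; the method-of-moments fit here is a simple stand-in for the maximum likelihood estimator the paper derives.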
Pages: 2456-2470 (14 pages)
Related papers
50 items in total
  • [1] Learning flat representations with artificial neural networks
    Constantinescu, Vlad
    Chiru, Costin
    Boloni, Tudor
    Florea, Adina
    Tacutu, Robi
    [J]. APPLIED INTELLIGENCE, 2021, 51 (04) : 2456 - 2470
  • [2] Learning efficiently with neural networks: A theoretical comparison between structured and flat representations
    Gori, M
    Frasconi, P
    Sperduti, A
    [J]. ECAI 2000: 14TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2000, 54 : 301 - 305
  • [3] Representations and generalization in artificial and brain neural networks
    Li, Qianyi
    Sorscher, Ben
    Sompolinsky, Haim
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2024, 121 (27)
  • [4] Statistical physics and representations in real and artificial neural networks
    Cocco, S.
    Monasson, R.
    Posani, L.
    Rosay, S.
    Tubiana, J.
    [J]. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2018, 504 : 45 - 76
  • [5] Convergent Temperature Representations in Artificial and Biological Neural Networks
    Haesemeyer, Martin
    Schier, Alexander F.
    Engert, Florian
    [J]. NEURON, 2019, 103 (06) : 1123+
  • [6] Deep Neural Networks for Learning Graph Representations
    Cao, Shaosheng
    Lu, Wei
    Xu, Qiongkai
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1145 - 1152
  • [7] SCREEN: Learning a flat syntactic and semantic spoken language analysis using artificial neural networks
    Wermter, S
    Weber, V
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1997, 6 : 35 - 85
  • [8] Complexity of learning in artificial neural networks
    Engel, A
    [J]. THEORETICAL COMPUTER SCIENCE, 2001, 265 (1-2) : 285 - 306
  • [9] Lifelong Learning in Artificial Neural Networks
    Anthes, Gary
    [J]. COMMUNICATIONS OF THE ACM, 2019, 62 (06) : 13 - 15
  • [10] Artificial neural networks and deep learning
    Geubbelmans, Melvin
    Rousseau, Axel-Jan
    Burzykowski, Tomasz
    Valkenborg, Dirk
    [J]. AMERICAN JOURNAL OF ORTHODONTICS AND DENTOFACIAL ORTHOPEDICS, 2024, 165 (02) : 248 - 251