Breaking the Activation Function Bottleneck through Adaptive Parameterization

Authors
Flennerhag, Sebastian [1, 2]
Yin, Hujun [1, 2]
Keane, John [1]
Elliot, Mark [1]
Affiliations
[1] Univ Manchester, Manchester, Lancs, England
[2] Alan Turing Inst, London, England
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure. We develop simple drop-in replacements that learn to adapt their parameterization conditional on the input, thereby increasing statistical efficiency significantly. We present an adaptive LSTM that advances the state of the art for the Penn Treebank and WikiText-2 word-modeling tasks while using fewer parameters and converging in less than half the number of iterations.
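The idea of a layer that adapts its parameterization conditional on the input can be illustrated with a minimal sketch. This is not the authors' formulation: the gate function, the names `adaptive_layer`, `P`, and `c`, and the choice of `tanh` are all assumptions made for illustration. The sketch rescales each row of a standard weight matrix by a factor computed from the input, so the effective weights change from one input to the next while the layer keeps its usual shape.

```python
import numpy as np

def adaptive_layer(x, W, b, P, c):
    """Feed-forward layer with input-conditional weight scaling (illustrative only)."""
    # Hypothetical adaptation policy (an assumption, not taken from the paper):
    # one multiplicative gate per output unit, computed from the input itself.
    gate = 1.0 + np.tanh(P @ x + c)  # between 0 and 2; exactly 1 when P and c are zero
    # Equivalent to applying the input-dependent weight matrix diag(gate) @ W.
    return np.tanh(gate * (W @ x) + b)

# Small usage example with random parameters.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
P = rng.normal(size=(3, 4))
c = np.zeros(3)
y = adaptive_layer(x, W, b, P, c)
```

With `P` and `c` set to zero the gate is identically one and the function reduces to an ordinary `tanh` layer, which is what makes this kind of construction usable as a drop-in replacement.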
Pages: 12