Parameter Transfer Unit for Deep Neural Networks

Cited by: 12
Authors
Zhang, Yinghua [1]
Zhang, Yu [1]
Yang, Qiang [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Transfer learning; Deep neural networks
DOI
10.1007/978-3-030-16145-3_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Parameters in deep neural networks trained on large-scale databases can generalize across multiple domains, a property referred to as "transferability". Unfortunately, transferability is usually defined in terms of discrete states, and it varies across domains and network architectures. Existing works usually apply parameter sharing or fine-tuning heuristically, and there is no principled approach to learning a parameter transfer strategy. To address this gap, this paper proposes a Parameter Transfer Unit (PTU). The PTU learns a fine-grained nonlinear combination of activations from both the source domain network and the target domain network, and subsumes hand-crafted discrete transfer states. In the PTU, transferability is controlled by two gates, which are artificial neurons and can be learned from data. The PTU is a general and flexible module that can be used in both CNNs and RNNs. It can also be integrated with other transfer learning methods in a plug-and-play manner. Experiments are conducted with various network architectures and multiple transfer domain pairs. The results demonstrate the effectiveness of the PTU, which outperforms heuristic parameter sharing and fine-tuning in most settings.
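The abstract describes the gating idea only in words, so the following is a minimal sketch of what a two-gate combination of source and target activations could look like. It is an illustration under stated assumptions, not the paper's formulation: the function names `neuron_gate` and `gated_combine`, the choice of a sigmoid neuron for each gate, and the weighted-sum combination rule are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_gate(weights, bias, inputs):
    """One artificial neuron used as a gate: sigmoid of a weighted sum."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def gated_combine(a_src, a_tgt, src_gate_params, tgt_gate_params):
    """Combine source and target activations with two learned gates.

    ASSUMPTION: this combination rule is hypothetical and is not taken
    from the paper. Each gate sees both activation vectors, so the
    output is a nonlinear function of the activations.
    """
    inputs = a_src + a_tgt  # concatenate the two activation lists
    w_s, b_s = src_gate_params
    w_t, b_t = tgt_gate_params
    g_s = neuron_gate(w_s, b_s, inputs)  # how much source signal passes
    g_t = neuron_gate(w_t, b_t, inputs)  # how much target signal passes
    return [g_s * s + g_t * t for s, t in zip(a_src, a_tgt)]


a_src = [1.0, 2.0]  # activations from the source-domain network
a_tgt = [3.0, 4.0]  # activations from the target-domain network
zeros = [0.0] * 4   # gate weights; zero here so only the bias matters

# Saturated gates approximate a discrete state: use the source only.
source_only = gated_combine(a_src, a_tgt, (zeros, 50.0), (zeros, -50.0))

# Half-open gates give an intermediate mixture of both networks.
blended = gated_combine(a_src, a_tgt, (zeros, 0.0), (zeros, 0.0))
```

In this sketch, saturated gate values approximate hard choices such as "use the source network only", while intermediate values blend the two networks. That is one way to picture how continuous gates could subsume discrete transfer states.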
Pages: 82-95
Number of pages: 14
Related papers
50 in total
  • [31] ATTL: An Automated Targeted Transfer Learning with Deep Neural Networks
    Ahamed, Sayyed Farid
    Aggarwal, Priyanka
    Shetty, Sachin
    Lanus, Erin
    Freeman, Laura J.
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021
  • [32] Japanese Animation Style Transfer using Deep Neural Networks
    Ye, Shiyang
    Ohtera, Ryo
    PROCEEDINGS OF THE 2017 IEEE INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND ENGINEERING (IEEE-ICICE 2017), 2017: 492-495
  • [33] A Transfer Learning Evaluation of Deep Neural Networks for Image Classification
    Abou Baker, Nermeen
    Zengeler, Nico
    Handmann, Uwe
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2022, 4 (01): 22-41
  • [34] Layer Removal for Transfer Learning with Deep Convolutional Neural Networks
    Zhi, Weiming
    Chen, Zhenghao
    Yueng, Henry Wing Fung
    Lu, Zhicheng
    Zandavi, Seid Miad
    Chung, Yuk Ying
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT II, 2017, 10635: 460-469
  • [35] Parameter masks for close talk speech segregation using deep neural networks
    Jiang, Yi
    Liu, Runsheng
    2015 7TH INTERNATIONAL CONFERENCE ON MECHANICAL AND ELECTRONICS ENGINEERING (ICMEE 2015), 2015, 31
  • [36] Convergence Analysis of PSO for Hyper-Parameter Selection in Deep Neural Networks
    Nalepa, Jakub
    Lorenzo, Pablo Ribalta
    ADVANCES ON P2P, PARALLEL, GRID, CLOUD AND INTERNET COMPUTING (3PGCIC-2017), 2018, 13: 284-295
  • [37] QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
    Kryzhanovskiy, Vladimir
    Balitskiy, Gleb
    Kozyrskiy, Nikolay
    Zuruev, Aleksandr
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 10679-10687
  • [38] Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
    Mostafa, Hesham
    Wang, Xin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [39] Particle Swarm Optimization for Hyper-Parameter Selection in Deep Neural Networks
    Lorenzo, Pablo Ribalta
    Nalepa, Jakub
    Kawulok, Michal
    Sanchez Ramos, Luciano
    Ranilla Pastor, Jose
    PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017: 481-488
  • [40] Deep inference: A Convolutional Neural Networks Method for Parameter Recovery of the Fractional Dynamics
    Biranvand, N.
    Hadian-Rasanan, A. H.
    Khalili, A.
    Rad, J. A.
    INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2021, 12 (01): 189-201