NASABN: A Neural Architecture Search Framework for Attention-Based Networks

Cited by: 0
Authors
Jing, Kun [1 ]
Xu, Jungang [1 ]
Xu, Hui [2 ]
Affiliations
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
[2] Zugeng Technol, Beijing, Peoples R China
Keywords
neural architecture search; recurrent neural network; language modeling; attention-based model
DOI
10.1109/ijcnn48605.2020.9207600
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recently, neural architecture search (NAS) has emerged as a technique of growing interest in automated machine learning (AutoML). Meanwhile, attention-based models, such as attention-based recurrent neural networks and transformer-based models, have been widely used in deep learning applications. However, no efficient NAS method so far can search the architectures of attention-based models. To solve this problem, we propose a framework named neural architecture search for attention-based networks (NASABN), which abstracts attention-based models and extracts the undefined parts of a model, including the attention layers and cells. NASABN is flexible and general enough to fit different NAS methods, and the searched architectures can be transferred across datasets. We conduct extensive experiments with NASABN using gradient descent-based methods such as DARTS on the Penn Treebank (PTB) and WikiText-2 (WT2) datasets, and achieve competitive performance compared with state-of-the-art methods.
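Since the abstract names DARTS as one of the gradient descent-based methods used with NASABN, the following PyTorch sketch illustrates the generic DARTS-style continuous relaxation applied to a set of candidate attention operations, i.e., the kind of "undefined part" that NASABN exposes to the search. This is an illustrative assumption, not code from the paper: the candidate operation set, class names, and dimensions are all hypothetical.

    # Illustrative sketch only: a DARTS-style mixed operation over candidate
    # attention layers. The candidate set and names are hypothetical; NASABN's
    # actual search space is defined in the paper.
    import torch
    import torch.nn as nn

    class DotProductAttention(nn.Module):
        # Single-head scaled dot-product self-attention (one candidate op).
        def __init__(self, dim):
            super().__init__()
            self.q = nn.Linear(dim, dim)
            self.k = nn.Linear(dim, dim)
            self.v = nn.Linear(dim, dim)

        def forward(self, x):  # x: (batch, seq_len, dim)
            q, k, v = self.q(x), self.k(x), self.v(x)
            scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
            return torch.softmax(scores, dim=-1) @ v

    class MixedAttention(nn.Module):
        # DARTS continuous relaxation: the output is a softmax(alpha)-weighted
        # sum of all candidate ops; alpha is optimized by gradient descent
        # alongside (or alternating with) the network weights.
        def __init__(self, dim):
            super().__init__()
            self.ops = nn.ModuleList([
                DotProductAttention(dim),                       # attention candidate
                nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),  # feed-forward candidate
                nn.Identity(),                                  # skip/no-op candidate
            ])
            self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture params

        def forward(self, x):
            weights = torch.softmax(self.alpha, dim=0)
            return sum(w * op(x) for w, op in zip(weights, self.ops))

    # After search, the candidate with the largest alpha would be kept and the
    # mixture pruned to a discrete architecture.
    x = torch.randn(2, 16, 32)  # (batch=2, seq_len=16, dim=32)
    y = MixedAttention(32)(x)
    print(y.shape)              # torch.Size([2, 16, 32])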
Pages: 7
Related Papers (50 in total)
  • [1] Zhou, Qinqin; Zhong, Bineng; Liu, Xin; Ji, Rongrong. Attention-Based Neural Architecture Search for Person Re-Identification. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(11): 6627-6639.
  • [2] Nakai, Kohei; Matsubara, Takashi; Uehara, Kuniaki. Neural Architecture Search for Convolutional Neural Networks with Attention. IEICE Transactions on Information and Systems, 2021, E104-D(2): 312-321.
  • [3] Lee, Juho; Lee, Yoonho; Kim, Jungtaek; Kosiorek, Adam R.; Choi, Seungjin; Teh, Yee Whye. Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks. International Conference on Machine Learning (ICML), PMLR 97, 2019.
  • [4] Sun, Chengcheng; Li, Chenhao; Lin, Xiang; Zheng, Tianji; Meng, Fanrong; Rui, Xiaobin; Wang, Zhixiao. Attention-based graph neural networks: a survey. Artificial Intelligence Review, 2023, 56(Suppl 2): 2263-2310.
  • [5] Xiong, Jing; Zhang, Yu. A Unifying Framework of Attention-Based Neural Load Forecasting. IEEE Access, 2023, 11: 51606-51616.
  • [6] Zhao, Zhiwei; Wu, Youzheng. Attention-based Convolutional Neural Networks for Sentence Classification. 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), 2016: 705-709.
  • [7] Grattarola, Daniele; Livi, Lorenzo; Alippi, Cesare; Wennberg, Richard; Valiante, Taufik A. Seizure localisation with attention-based graph neural networks. Expert Systems with Applications, 2022, 203.
  • [8] Nauta, Meike; Bucur, Doina; Seifert, Christin. Causal Discovery with Attention-Based Convolutional Neural Networks. Machine Learning and Knowledge Extraction, 2019, 1(1).
  • [9] Wu, Zachary; Yang, Kevin K.; Liszka, Michael J.; Lee, Alycia; Batzilla, Alina; Wernick, David; Weiner, David P.; Arnold, Frances H. Signal Peptides Generated by Attention-Based Neural Networks. ACS Synthetic Biology, 2020, 9(8): 2154-2161.