Efficient Neural Architecture Search for Long Short-Term Memory Networks

Cited by: 1
Authors
Abed, Hamdi [1 ]
Gyires-Toth, Balint [1 ]
Affiliations
[1] Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics, Budapest, Hungary
Keywords
Neural Architecture Search; Long Short-Term Memory; LSTM; sequence modelling; natural language processing
DOI
10.1109/SAMI50585.2021.9378612
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Automated machine learning (AutoML) is a technique that helps determine the optimal or near-optimal model for a specific dataset, and it has been a focus of research in recent years. Automating model design opens the door for non-experts to apply machine learning in a variety of scenarios, which is appealing both to a wide range of researchers and to cloud services. Neural Architecture Search (NAS) is a subfield of AutoML in which the architecture of an artificial neural network is typically searched with adaptive algorithms. This paper proposes a method to apply Efficient Neural Architecture Search (ENAS) to LSTM-like recurrent architectures, which use a gating mechanism and an inner memory. Using this method, the paper investigates whether the handcrafted Long Short-Term Memory (LSTM) cell is an optimal or near-optimal solution for sequence modelling on a given dataset, or whether other, automatically designed recurrent structures outperform it. The performance of the vanilla LSTM is examined and compared against recurrent architectures designed by random search and by reinforcement learning-based ENAS. The proposed methods are evaluated on a text generation task on the Penn TreeBank dataset.
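For readers unfamiliar with how ENAS represents a recurrent cell, the sketch below illustrates the general idea in PyTorch: the cell's internal wiring is supplied at forward time as a sampled architecture description, and all candidate connections share their weights, so many sampled architectures can be evaluated without retraining from scratch. This is an illustrative reconstruction following the ENAS recipe of Pham et al. (2018), not the authors' implementation; the node count, the activation set, and names such as ENASRecurrentCell and node_proj are assumptions made for this example.

    # Minimal ENAS-style recurrent cell sketch (assumed, not the paper's code).
    import torch
    import torch.nn as nn

    ACTIVATIONS = {
        "tanh": torch.tanh,
        "relu": torch.relu,
        "sigmoid": torch.sigmoid,
        "identity": lambda x: x,
    }

    class ENASRecurrentCell(nn.Module):
        """Recurrent cell whose internal wiring is a sampled architecture.

        An architecture is a list with one entry per internal node: the first
        entry names the activation of node 0 (which mixes the input with the
        previous hidden state); each later entry is a (predecessor index,
        activation name) pair. Weights are shared across all candidate
        connections -- the key idea behind ENAS.
        """

        def __init__(self, input_size, hidden_size, num_nodes=4):
            super().__init__()
            self.num_nodes = num_nodes
            # Node 0 mixes the current input with the previous hidden state.
            self.input_proj = nn.Linear(input_size + hidden_size, hidden_size)
            # One shared weight matrix per (node, possible predecessor) pair.
            self.node_proj = nn.ModuleList([
                nn.ModuleList([
                    nn.Linear(hidden_size, hidden_size, bias=False)
                    for _ in range(node)
                ])
                for node in range(1, num_nodes)
            ])

        def forward(self, x, h_prev, arch):
            assert len(arch) == self.num_nodes
            # Node 0: activation applied to the mixed input/hidden projection.
            nodes = [ACTIVATIONS[arch[0]](
                self.input_proj(torch.cat([x, h_prev], dim=-1)))]
            # Nodes 1..N-1: each reads one earlier node through a shared weight.
            for j, (prev, act) in enumerate(arch[1:], start=1):
                nodes.append(ACTIVATIONS[act](
                    self.node_proj[j - 1][prev](nodes[prev])))
            # New hidden state: average the "loose ends", i.e. nodes that no
            # other node consumed as input.
            used = {prev for prev, _ in arch[1:]}
            leaves = [n for i, n in enumerate(nodes) if i not in used]
            return torch.stack(leaves).mean(dim=0)

A hypothetical usage, with arbitrary sizes:

    cell = ENASRecurrentCell(input_size=64, hidden_size=128, num_nodes=4)
    arch = ["tanh", (0, "relu"), (1, "sigmoid"), (0, "identity")]
    x = torch.randn(8, 64)       # batch of 8 input vectors
    h = torch.zeros(8, 128)      # initial hidden state
    h_next = cell(x, h, arch)    # shape: (8, 128)

In full ENAS, such architecture lists are sampled by an LSTM controller trained with REINFORCE, using the shared-weight child model's validation performance (e.g. perplexity on Penn TreeBank) as the reward; only the controller's parameters are updated per sampled architecture, which is what makes the search efficient.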
Pages: 287-292 (6 pages)