LSTM-CRP: Algorithm-Hardware Co-Design and Implementation of Cache Replacement Policy Using Long Short-Term Memory

Cited: 0
Authors
Wang, Yizhou [1 ]
Meng, Yishuo [1 ]
Wang, Jiaxing [1 ]
Yang, Chen [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Microelect, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
memory bottleneck; cache replacement policy; long short-term memory; LSTM hardware accelerator; lightweight;
DOI
10.3390/bdcc8100140
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As deep learning has produced dramatic breakthroughs in many areas, it has motivated emerging studies combining neural networks with cache replacement algorithms. However, deep learning is a poor fit for cache replacement in hardware because its neural network models are impractically large and slow. Many studies have used the guidance of the Belady algorithm to speed up cache replacement prediction, but accurately predicting the characteristics of future access addresses remains impractical, which introduces inaccuracy when discriminating complex access patterns. This paper therefore presents the LSTM-CRP algorithm and its efficient hardware implementation, which employs long short-term memory (LSTM) networks to identify access patterns at run-time and guide the cache replacement algorithm. LSTM-CRP first converts each address into a novel key according to the frequency of the access address and a virtual capacity of the cache, which offers low information redundancy and high timeliness. Using the key as input to four offline-trained LSTM network-based predictors, LSTM-CRP can accurately classify different access patterns and identify current cache characteristics in a timely manner via an online set-dueling mechanism on sampled caches. For efficient implementation, heterogeneous lightweight LSTM networks are dedicatedly constructed in LSTM-CRP to lower hardware overhead and inference delay. Experimental results show that LSTM-CRP improved the cache hit rate by 20.10%, 15.35%, 12.11% and 8.49% on average compared with LRU, RRIP, Hawkeye and Glider, respectively. Implemented on a Xilinx XCVU9P FPGA at a cost of 15,973 LUTs and 1610 flip-flop registers, LSTM-CRP ran at a 200 MHz frequency with 2.74 W power consumption.
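The online set-dueling mechanism the abstract mentions is a known cache technique: a few "leader" sets each run one candidate policy, a saturating counter tallies which leader group misses less, and all remaining "follower" sets adopt the current winner. The following is a minimal, hypothetical sketch of that general idea only; the class name, counter width, and sampling layout are illustrative assumptions, not the authors' LSTM-CRP implementation.

```python
# Minimal sketch of set dueling (assumptions: two competing policies,
# leader sets chosen by set-index modulo, one saturating counter).
class SetDueling:
    def __init__(self, num_sets=1024, sample_every=32, counter_bits=10):
        self.num_sets = num_sets
        self.sample_every = sample_every        # 1 leader per policy per group
        self.max_count = (1 << counter_bits) - 1
        self.counter = self.max_count // 2      # start undecided

    def leader_of(self, set_index):
        """Return the policy (0 or 1) this set leads, or None for followers."""
        if set_index % self.sample_every == 0:
            return 0
        if set_index % self.sample_every == 1:
            return 1
        return None

    def record_miss(self, set_index):
        """A miss in a leader set counts as a vote against that policy."""
        leader = self.leader_of(set_index)
        if leader == 0:
            self.counter = min(self.counter + 1, self.max_count)
        elif leader == 1:
            self.counter = max(self.counter - 1, 0)

    def policy_for(self, set_index):
        """Leaders always run their own policy; followers copy the winner."""
        leader = self.leader_of(set_index)
        if leader is not None:
            return leader
        # Counter above midpoint means policy 0's leaders missed more.
        return 1 if self.counter > self.max_count // 2 else 0
```

In LSTM-CRP this selection is reportedly driven by the four LSTM predictors classifying the access pattern on sampled caches; the sketch above only illustrates the underlying dueling/arbitration structure.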
Pages: 28
Related Papers
50 records
  • [21] FUEL MOISTURE CONTENT FORECASTING USING LONG SHORT-TERM MEMORY(LSTM) MODEL
    Kang, Zhenyu
    Jiao, Miao
    Zhou, Zijie
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 5672 - 5675
  • [22] Rainfall-runoff modelling using Long Short-Term Memory (LSTM) networks
    Kratzert, Frederik
    Klotz, Daniel
    Brenner, Claire
    Schulz, Karsten
    Herrnegger, Mathew
    HYDROLOGY AND EARTH SYSTEM SCIENCES, 2018, 22 (11) : 6005 - 6022
  • [23] Wind Speed Prediction and Visualization Using Long Short-Term Memory Networks (LSTM)
    Ehsan, Amimul
    Shahirinia, Amir
    Zhang, Nian
    Oladunni, Timothy
    2020 10TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2020, : 234 - 240
  • [24] Short-Term Load Forecasting using Long Short Term Memory Optimized by Genetic Algorithm
    Zulfiqar, Muhammad
    Rasheed, Muhammad Babar
    2022 IEEE SUSTAINABLE POWER AND ENERGY CONFERENCE (ISPEC), 2022,
  • [25] Efficient Implementation of QRD-RLS Algorithm using Hardware-Software Co-design
    Lodha, Nupur
    Rai, Nivesh
    Krishnamurthy, Aarthy
    Venkataraman, Hrishikesh
    2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 2973 - +
  • [26] Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks
    Zazo, Ruben
    Lozano-Diez, Alicia
    Gonzalez-Dominguez, Javier
    Toledano, Doroteo T.
    Gonzalez-Rodriguez, Joaquin
    PLOS ONE, 2016, 11 (01):
  • [27] FPGA Hardware Implementation of Efficient Long Short-Term Memory Network Based on Construction Vector Method
    Li, Tengfei
    Gu, Shenshen
    IEEE ACCESS, 2023, 11 : 122357 - 122367
  • [28] Algorithm and hardware co-design co-optimization framework for LSTM accelerator using quantized fully decomposed tensor train
    Liu, Mingshuo
    Yin, Miao
    Han, Kevin
    Demara, Ronald F.
    Yuan, Bo
    Bai, Yu
    INTERNET OF THINGS, 2023, 22
  • [29] PM2.5 Forecast in Korea using the Long Short-Term Memory (LSTM) Model
    Chang-Hoi Ho
    Ingyu Park
    Jinwon Kim
    Jae-Bum Lee
    Asia-Pacific Journal of Atmospheric Sciences, 2023, 59 : 563 - 576
  • [30] Multiple Sequence Behavior Recognition on Humanoid Robot using Long Short-Term Memory (LSTM)
    How, Dickson Neoh Tze
    Sahari, Khairul Salleh Mohamed
    Hu Yuhuang
    Kiong, Loo Chu
    2014 IEEE INTERNATIONAL SYMPOSIUM ON ROBOTICS AND MANUFACTURING AUTOMATION (ROMA), 2014, : 109 - 114