Hidden-ROM: A Compute-in-ROM Architecture to Deploy Large-Scale Neural Networks on Chip with Flexible and Scalable Post-Fabrication Task Transfer Capability

Citations: 0
Authors
Chen, Yiming [1]
Yin, Guodong [1]
Lee, Mingyen [1]
Tang, Wenjun [1]
Yang, Zekun [1]
Liu, Yongpan [1]
Yang, Huazhong [1]
Li, Xueqing [1]
Affiliations
[1] Tsinghua Univ, Elect Engn Dept, BNRist, ICFC, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Computing-in-Memory; ROM-CiM; Read-Only Memory; YOLoC; Hidden Neural Network; HNN; Processing-in-Memory; PIM; Task transfer; MEMORY SRAM MACRO;
DOI
10.1145/3508352.3549335
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Motivated by reducing the data transfer activities in data-intensive neural network computing, SRAM-based compute-in-memory (CiM) has made significant progress. Unfortunately, SRAM has low density and limited on-chip capacity. This makes the deployment of large models inefficient due to the frequent DRAM access to update the weight in SRAM. Recently, a ROM-based CiM design, YOLoC, reveals the unique opportunity of deploying a large-scale neural network in CMOS by exploring the intriguing high density of ROM. However, even though assisting SRAM has been adopted in YOLoC for task transfer within the same domain, it is still a big challenge to overcome the read-only limitation in ROM and enable more flexibility. Therefore, it is of paramount significance to develop new ROM-based CiM architectures and provide broader task space and model expansion capability for more complex tasks. This paper presents Hidden-ROM for high flexibility of ROM-based CiM. Hidden-ROM provides several novel ideas beyond YOLoC. First, it adopts a one-SRAM-many-ROM method that "hides" ROM cells to support various datasets of different domains, including CIFAR10/100, FER2013, and ImageNet. Second, Hidden-ROM provides the model expansion capability after chip fabrication to update the model for more complex tasks when needed. Experiments show that Hidden-ROM designed for ResNet-18 pretrained on CIFAR100 (item classification) can achieve <0.5% accuracy loss in FER2013 (facial expression recognition), while YOLoC degrades by >40%. After expanding to ResNet-50/101, Hidden-ROM even achieves 68.6%/72.3% accuracy in ImageNet, close to 74.9%/76.4% by software. Such expansion costs only 7.6%/12.7% energy efficiency overhead while providing 12%/16% accuracy improvement after expansion.
Pages: 9
Related papers
1 in total
  • [1] YOLoC: DeploY Large-Scale Neural Network by ROM-based Computing-in-Memory using ResiduaL Branch on a Chip
    Chen, Yiming
    Yin, Guodong
    Tan, Zhanhong
    Lee, Mingyen
    Yang, Zekun
    Liu, Yongpan
    Yang, Huazhong
    Ma, Kaisheng
    Li, Xueqing
    [J]. Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022, 2022: 1093-1098