ProFormer: Towards On-Device LSH Projection Based Transformers

Authors
Sankar, Chinnadhurai [1]
Ravi, Sujith [2]
Kozareva, Zornitsa [3]
Affiliations
[1] Univ Montreal, Mila, Montreal, PQ, Canada
[2] Amazon Alexa, Sunnyvale, CA USA
[3] Google, Mountain View, CA 94043 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
At the heart of text-based neural models lie word representations, which are powerful but occupy a lot of memory, making them challenging to deploy on devices with memory constraints such as mobile phones, watches, and IoT devices. To surmount these challenges, we introduce ProFormer, a projection-based transformer architecture that is faster and lighter, making it suitable for deployment on memory-constrained devices while preserving user privacy. We use an LSH projection layer to generate word representations on the fly without embedding lookup tables, leading to a significant memory footprint reduction from O(V·d) to O(T), where V is the vocabulary size, d is the embedding dimension, and T is the dimension of the LSH projection representation. We also propose a local projection attention (LPA) layer, which uses self-attention to transform the input sequence of N LSH word projections into a sequence of N/K representations, reducing the computation quadratically, by O(K^2). We evaluate ProFormer on multiple text classification tasks and observe improvements over prior state-of-the-art on-device approaches for short text classification and comparable performance on long text classification tasks. ProFormer is also competitive with other popular but highly resource-intensive approaches like BERT and even outperforms small-sized BERT variants with significant resource savings: it reduces the embedding memory footprint from 92.16 MB to 1.7 KB and requires 16x less computation overhead, making it the fastest and smallest on-device model.
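The memory and computation claims in the abstract can be illustrated with a rough Python sketch. This is not the authors' implementation: `project_token` uses a plain SHA-256 fingerprint as a stand-in for the LSH projection, and the values V = 30000, d = 768, T = 420, and 4 bytes per value are assumptions chosen only because they reproduce the 92.16 MB and roughly 1.7 KB figures quoted above. All function names are hypothetical.

```python
import hashlib
from typing import List


def project_token(token: str, num_bits: int = 420) -> List[int]:
    """Map a token to a fixed-length {-1, +1} vector without a lookup table.

    Stand-in for an LSH projection: bits come from repeated SHA-256 hashing,
    so the result depends only on the token text, not on a stored table.
    """
    bits: List[int] = []
    counter = 0
    while len(bits) < num_bits:
        digest = hashlib.sha256(f"{counter}:{token}".encode("utf-8")).digest()
        for byte in digest:
            for shift in range(8):
                bits.append(1 if (byte >> shift) & 1 else -1)
        counter += 1
    return bits[:num_bits]


def embedding_table_bytes(vocab_size: int, dim: int, bytes_per_value: int = 4) -> int:
    """Size of a V x d embedding lookup table, i.e. the O(V*d) term."""
    return vocab_size * dim * bytes_per_value


def projection_bytes(num_bits: int, bytes_per_value: int = 4) -> int:
    """Size of a single T-dimensional projection, i.e. the O(T) term."""
    return num_bits * bytes_per_value


def attention_cost_ratio(seq_len: int, group_size: int) -> int:
    """Pairwise attention cost over N items divided by the cost over N/K items."""
    reduced_len = seq_len // group_size
    return (seq_len * seq_len) // (reduced_len * reduced_len)


if __name__ == "__main__":
    # Assumed sizes (not stated in the abstract): V=30000, d=768, T=420.
    print(embedding_table_bytes(30000, 768) / 1e6, "MB")
    print(projection_bytes(420) / 1e3, "KB")
    print(attention_cost_ratio(64, 4), "x fewer pairwise scores")
```

Because the projection is recomputed from the token text each time, nothing that scales with the vocabulary needs to be stored, and shortening a sequence from N to N/K items shrinks the number of pairwise attention scores by a factor of K squared.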
Pages: 2823-2828
Page count: 6