Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

Cited by: 0
Authors
Ang, Phyllis [1]
Dhingra, Bhuwan [1]
Wills, Lisa Wu [1]
Affiliations
[1] Duke Univ, Durham, NC 27708 USA
Keywords
DOI
Not available
Chinese Library Classification
F [Economics];
Discipline classification code
02;
Abstract
With many real-world applications of Natural Language Processing (NLP) involving long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic study of this accuracy vs. efficiency trade-off on two widely used long-sequence models, Longformer-Encoder-Decoder (LED) and Big Bird, during fine-tuning and inference on four datasets from the SCROLLS benchmark. To study how this trade-off differs across hyperparameter settings, we compare the models across four sequence lengths (1024, 2048, 3072, 4096) and two model sizes (base and large) under a fixed resource budget. We find that LED consistently achieves better accuracy at lower energy costs than Big Bird. For summarization, we find that increasing model size is a more energy-efficient route to higher accuracy than increasing sequence length. However, this comes at the cost of a large drop in inference speed. For question answering, we find that smaller models are both more efficient and more accurate due to the larger training batch sizes possible under a fixed resource budget.
Pages: 113-121
Page count: 9