Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

Cited by: 0
Authors
Ang, Phyllis [1]
Dhingra, Bhuwan [1]
Wills, Lisa Wu [1]
Affiliations
[1] Duke Univ, Durham, NC 27708 USA
Keywords
DOI
Not available
Chinese Library Classification number
F [Economics];
Discipline classification code
02;
Abstract
With many real-world applications of Natural Language Processing (NLP) involving long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic study of this accuracy vs. efficiency trade-off on two widely used long-sequence models, Longformer-Encoder-Decoder (LED) and Big Bird, during fine-tuning and inference on four datasets from the SCROLLS benchmark. To study how this trade-off differs across hyperparameter settings, we compare the models across four sequence lengths (1024, 2048, 3072, 4096) and two model sizes (base and large) under a fixed resource budget. We find that LED consistently achieves better accuracy at lower energy costs than Big Bird. For summarization, we find that increasing model size is a more energy-efficient way to reach higher accuracy than increasing sequence length. However, this comes at the cost of a large drop in inference speed. For question answering, we find that smaller models are both more efficient and more accurate, owing to the larger training batch sizes possible under a fixed resource budget.
Pages: 113-121
Page count: 9