Scaling laws for language encoding models in fMRI

Cited by: 0
Authors
Antonello, Richard J. [1]
Vaidya, Aditya R. [1]
Huth, Alexander G. [1,2]
Affiliations
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[2] Univ Texas Austin, Dept Neurosci, Austin, TX 78712 USA
Keywords: (none listed)
DOI: not available
CLC number: TP18 [Artificial intelligence theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Representations from transformer-based unidirectional language models are known to be effective at predicting brain responses to natural language. However, most studies comparing language models to brains have used GPT-2 or similarly sized language models. Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI. Mirroring scaling results from other contexts, we found that brain prediction performance scales logarithmically with model size from 125M to 30B parameter models, with ∼15% increased encoding performance as measured by correlation with a held-out test set across 3 subjects. Similar logarithmic behavior was observed when scaling the size of the fMRI training set. We also characterized scaling for acoustic encoding models that use HuBERT, WavLM, and Whisper, and we found comparable improvements with model size. A noise ceiling analysis of these large, high-performance encoding models showed that performance is nearing the theoretical maximum for brain areas such as the precuneus and higher auditory cortex. These results suggest that increasing scale in both models and data will yield incredibly effective models of language processing in the brain, enabling better scientific understanding as well as applications such as decoding.
Pages: 13