NarrativeXL: a Large-scale Dataset for Long-Term Memory Models

Cited by: 0
Authors
Moskvichev, Arseny [1]
Mai, Ky-Vinh [2]
Affiliations
[1] Santa Fe Inst, Santa Fe, NM 87501 USA
[2] Univ Calif Irvine, Irvine, CA USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a new large-scale (nearly a million questions), ultra-long-context (average document length of more than 50,000 words) reading comprehension dataset. Using GPT 3.5, we summarized each scene in 1,500 hand-curated fiction books from Project Gutenberg, which resulted in approximately 150 scene-level summaries per book. We then created a number of reading comprehension questions based on these summaries, including three types of multiple-choice scene recognition questions as well as free-form narrative reconstruction questions. With 990,595 total questions, our dataset is an order of magnitude larger than the closest alternatives. Crucially, most questions have a known "retention demand", indicating how long-term a memory is needed to answer them, which should aid the evaluation of long-term memory performance. We validate our data in four small-scale experiments: one with human labelers and three with existing language models. We show that our questions 1) adequately represent the source material, 2) can be used to diagnose a model's memory capacity, and 3) are not trivial for modern language models even when the memory demand does not exceed those models' context lengths. Lastly, we provide our code, which can be used to further expand the dataset with minimal human labor.
Pages: 15058-15072
Page count: 15