Comparing Text Representations: A Theory-Driven Approach

Cited by: 0
Authors
Yauney, Gregory [1]
Mimno, David [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Much of the progress in contemporary NLP has come from learning representations, such as masked language model (MLM) contextual embeddings, that turn challenging problems into simple classification tasks. But how do we quantify and explain this effect? We adapt general tools from computational learning theory to fit the specific characteristics of text datasets and present a method to evaluate the compatibility between representations and tasks. Even though many tasks can be easily solved with simple bag-of-words (BOW) representations, BOW does poorly on hard natural language inference tasks. For one such task we find that BOW cannot distinguish between real and randomized labelings, while pre-trained MLM representations show 72x greater distinction between real and random labelings than BOW. This method provides a calibrated, quantitative measure of the difficulty of a classification-based NLP task, enabling comparisons between representations without requiring empirical evaluations that may be sensitive to initializations and hyperparameters. The method provides a fresh perspective on the patterns in a dataset and the alignment of those patterns with specific labels.
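Illustrative sketch (not from the paper): the abstract's idea of contrasting real and randomized labelings can be shown on toy data. The score used here, leave-one-out nearest-neighbor label agreement, is a stand-in chosen for simplicity and is an assumption; it is not the measure the authors derive from computational learning theory. The names nn_label_agreement and real_vs_random, and the toy vectors, are hypothetical.

```python
import random


def nn_label_agreement(vectors, labels):
    """Fraction of points whose nearest other point shares their label.

    Stand-in alignment score (an assumption, not the paper's measure).
    """
    agree = 0
    for i, v in enumerate(vectors):
        best_j, best_d = None, float("inf")
        for j, w in enumerate(vectors):
            if i == j:
                continue
            # Squared Euclidean distance between the two vectors.
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best_j, best_d = j, d
        agree += labels[best_j] == labels[i]
    return agree / len(vectors)


def real_vs_random(vectors, labels, n_shuffles=200, seed=0):
    """Return the score under the real labels and the mean score
    under randomly shuffled copies of those labels."""
    rng = random.Random(seed)
    real = nn_label_agreement(vectors, labels)
    shuffled_scores = []
    for _ in range(n_shuffles):
        perm = list(labels)
        rng.shuffle(perm)
        shuffled_scores.append(nn_label_agreement(vectors, perm))
    return real, sum(shuffled_scores) / n_shuffles


# Toy "representation": two well-separated clusters, one per class.
vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = [0, 0, 0, 1, 1, 1]

real, random_mean = real_vs_random(vectors, labels)
print("real labels:", real, "shuffled labels (mean):", random_mean)
```

A representation that fits the task should score clearly higher under the real labels than under shuffled ones; a gap near zero would suggest the representation cannot tell the real labeling from a random one, which is the situation the abstract reports for BOW on a hard inference task.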
Pages: 5527-5539
Number of pages: 13