Word Embeddings: What Works, What Doesn't, and How to Tell the Difference for Applied Research

Cited by: 55
Authors
Rodriguez, Pedro L. [1]
Spirling, Arthur [2]
Affiliations
[1] Vanderbilt Univ, Data Sci Inst, Nashville, TN 37212 USA
[2] NYU, Polit & Data Sci, 550 1St Ave, New York, NY 10012 USA
Source
JOURNAL OF POLITICS | 2022 / Vol. 84 / No. 01
Funding
U.S. National Science Foundation;
Keywords
MODELS;
DOI
10.1086/715162
Chinese Library Classification number
D0 [Political science, political theory];
Discipline classification code
0302 ; 030201 ;
Abstract
Word embeddings are becoming popular for political science research, yet we know little about their properties and performance. To help scholars seeking to use these techniques, we explore the effects of key parameter choices (including context window length, embedding vector dimensions, and pretrained versus locally fit variants) on the efficiency and quality of inferences possible with these models. Reassuringly, we show that results are generally robust to such choices for political corpora of various sizes and in various languages. Beyond reporting extensive technical findings, we provide a novel crowdsourced "Turing test"-style method for examining the relative performance of any two models that produce substantive, text-based outputs. Our results are encouraging: popular, easily available pretrained embeddings perform at a level close to, or surpassing, both human coders and more complicated locally fit models. For completeness, we provide best practice advice for cases where local fitting is required.
Pages: 101-115
Page count: 15