Word Embeddings: What Works, What Doesn't, and How to Tell the Difference for Applied Research

Cited by: 55
Authors
Rodriguez, Pedro L. [1]
Spirling, Arthur [2]
Affiliations
[1] Vanderbilt Univ, Data Sci Inst, Nashville, TN 37212 USA
[2] NYU, Polit & Data Sci, 550 1St Ave, New York, NY 10012 USA
Source
JOURNAL OF POLITICS | 2022 / Vol. 84 / No. 01
Funding
National Science Foundation (USA);
Keywords
MODELS;
DOI
10.1086/715162
Chinese Library Classification
D0 [Political science, political theory];
Discipline classification code
0302 ; 030201 ;
Abstract
Word embeddings are becoming popular for political science research, yet we know little about their properties and performance. To help scholars seeking to use these techniques, we explore the effects of key parameter choices (including context window length, embedding vector dimensions, and pretrained versus locally fit variants) on the efficiency and quality of inferences possible with these models. Reassuringly, we show that results are generally robust to such choices for political corpora of various sizes and in various languages. Beyond reporting extensive technical findings, we provide a novel crowdsourced "Turing test"-style method for examining the relative performance of any two models that produce substantive, text-based outputs. Our results are encouraging: popular, easily available pretrained embeddings perform at a level close to, or surpassing, both human coders and more complicated locally fit models. For completeness, we provide best practice advice for cases where local fitting is required.
Pages: 101-115
Page count: 15