Word Embeddings for the Analysis of Ideological Placement in Parliamentary Corpora

Cited by: 61
Authors
Rheault, Ludovic [1, 2]
Cochrane, Christopher [1]
Affiliations
[1] Univ Toronto, Dept Polit Sci, Toronto, ON, Canada
[2] Univ Toronto, Munk Sch Global Affairs & Publ Policy, Toronto, ON, Canada
Keywords
word embeddings; parliamentary corpora; text as data; political ideology; natural language processing; POSITIONS; LANGUAGE; PARTIES; TEXT
DOI
10.1017/pan.2019.26
Chinese Library Classification (CLC) number
D0 [Political science, political theory]
Discipline classification codes
0302; 030201
Abstract
Word embeddings, the coefficients from neural network models predicting the use of words in context, have now become inescapable in applications involving natural language processing. Despite a few studies in political science, the potential of this methodology for the analysis of political texts has yet to be fully uncovered. This paper introduces models of word embeddings augmented with political metadata and trained on large-scale parliamentary corpora from Britain, Canada, and the United States. We fit these models with indicator variables of the party affiliation of members of parliament, which we refer to as party embeddings. We illustrate how these embeddings can be used to produce scaling estimates of ideological placement and other quantities of interest for political research. To validate the methodology, we assess our results against indicators from the Comparative Manifestos Project, surveys of experts, and measures based on roll-call votes. Our findings suggest that party embeddings are successful at capturing latent concepts such as ideology, and the approach provides researchers with an integrated framework for studying political language.
Pages: 112-133
Page count: 22
Related papers
50 in total
  • [1] Asynchronous Training of Word Embeddings for Large Text Corpora
    Anand, Avishek
    Khosla, Megha
    Singh, Jaspreet
    Zab, Jan-Hendrik
    Zhang, Zijian
    PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19), 2019, : 168 - 176
  • [2] Enriching Word Embeddings with a Regressor Instead of Labeled Corpora
    Abdalla, Mohamed
    Sahlgren, Magnus
    Hirst, Graeme
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6188 - 6195
  • [3] Word Alignment by Fine-tuning Embeddings on Parallel Corpora
    Dou, Zi-Yi
    Neubig, Graham
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2112 - 2128
  • [4] The Impact of Specialized Corpora for Word Embeddings in Natural Langage Understanding
    Neuraz, Antoine
    Rance, Bastien
    Garcelon, Nicolas
    Llanos, Leonardo Campillos
    Burgun, Anita
    Rosset, Sophie
    DIGITAL PERSONALIZED HEALTH AND MEDICINE, 2020, 270 : 432 - 436
  • [5] HistorEx: Exploring Historical Text Corpora Using Word and Document Embeddings
    Mueller, Sven
    Brunzel, Michael
    Kaun, Daniela
    Biswas, Russa
    Koutraki, Maria
    Tietz, Tabea
    Sack, Harald
    SEMANTIC WEB: ESWC 2019 SATELLITE EVENTS, 2019, 11762 : 136 - 140
  • [6] Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings
    Tan, Luchen
    Zhang, Haotian
    Clarke, Charles L. A.
    Smucker, Mark D.
    PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 657 - 661
  • [7] Improving Biomedical Information Extraction with Word Embeddings Trained on Closed-Domain Corpora
    Silvestri, Stefano
    Gargiulo, Francesco
    Ciampi, Mario
    2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2019, : 1129 - 1134
  • [8] Deep learning in law: early adaptation and legal word embeddings trained on large corpora
    Chalkidis, Ilias
    Kampas, Dimitrios
    ARTIFICIAL INTELLIGENCE AND LAW, 2019, 27 (02) : 171 - 198
  • [9] Deep learning in law: early adaptation and legal word embeddings trained on large corpora
    Ilias Chalkidis
    Dimitrios Kampas
    Artificial Intelligence and Law, 2019, 27 : 171 - 198
  • [10] Inferring Multilingual Domain-Specific Word Embeddings From Large Document Corpora
    Cagliero, Luca
    La Quatra, Moreno
    IEEE ACCESS, 2021, 9 : 137309 - 137321