Scaling Data from Multiple Sources

Authors
Enamorado, Ted [1]
Lopez-Moctezuma, Gabriel [2]
Ratkovic, Marc [3]
Affiliations
[1] Washington Univ, Dept Polit Sci, St Louis, MO 63130 USA
[2] CALTECH, Div Humanities & Social Sci, Pasadena, CA 91125 USA
[3] Princeton Univ, Dept Polit, Princeton, NJ 08544 USA
Keywords
multidimensional scaling; principal component analysis; U.S. Senate; BAYESIAN FACTOR-ANALYSIS; MODELS; PREFERENCES; LIKELIHOOD; FRAMEWORK
DOI
10.1017/pan.2020.24
Chinese Library Classification number
D0 [Political Science, Political Theory]
Subject classification codes
0302; 030201
Abstract
We introduce a method for scaling two datasets from different sources. The proposed method estimates a latent factor common to both datasets as well as an idiosyncratic factor unique to each. In addition, it offers a flexible modeling strategy that permits the scaled locations to be a function of covariates, and efficient implementation allows for inference through resampling. A simulation study shows that our proposed method improves over existing alternatives in capturing the variation common to both datasets, as well as the latent factors specific to each. We apply our proposed method to vote and speech data from the 112th U.S. Senate. We recover a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives, while recovering the words most associated with each senator's location. In addition, we estimate a word-specific subspace that ranges from national security to budget concerns, and a vote-specific subspace with Tea Party senators on one extreme and senior committee leaders on the other.
Pages: 212-235
Page count: 24