Leveraging Social Media Signals for Record Linkage

Cited by: 4
Authors
Schneider, Andrew T. [1]
Mukherjee, Arjun [2]
Dragut, Eduard C. [1]
Affiliations
[1] Temple Univ, Philadelphia, PA 19122 USA
[2] Univ Houston, Houston, TX 77004 USA
Funding
U.S. National Science Foundation
Keywords
Record Linkage; Word Embeddings; Word Mover's Distance
DOI
10.1145/3178876.3186018
Chinese Library Classification (CLC) number
TP39 [Computer Applications]
Discipline classification codes
081203; 0835
Abstract
Many data-intensive applications collect (structured) data from a variety of sources. A key task in this process is record linkage: the problem of determining which records from these sources refer to the same real-world entities. Traditional approaches use the record representation of entities to accomplish this task. With the rise of social media, entities on the Web are now accompanied by user-generated content. We present a method for record linkage that uses this hitherto untapped source of entity information. We use document-based distances, with an emphasis on word embedding document distances, to determine whether two entities match. Our rationale is that user evaluations of entities converge in semantic content, and hence in the word embedding space, as the number of user evaluations grows. We analyze the effectiveness of the proposed method both as a stand-alone method and in combination with record-based record linkage methods. Experimental results using real-world reviews demonstrate the high effectiveness of our approach. To our knowledge, this is the first work exploring the use of user-generated content accompanying entities in the record linkage task.
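To make the idea of matching entities by a word embedding document distance concrete, here is a minimal sketch in Python. It is not the authors' implementation: it uses the relaxed Word Mover's Distance (a lower bound on the full Word Mover's Distance) instead of solving the full transport problem, and the two-dimensional embeddings, the function names, and the 0.5 threshold are all hypothetical values chosen for illustration.

```python
import math
from collections import Counter

# Toy 2-D "embeddings", invented purely for illustration. A real system
# would load pretrained vectors (for example word2vec or GloVe).
TOY_EMBEDDINGS = {
    "pizza": (1.0, 0.0),
    "pasta": (0.9, 0.1),
    "italian": (0.8, 0.2),
    "haircut": (0.0, 1.0),
    "salon": (0.1, 0.9),
    "stylist": (0.2, 0.8),
}


def nbow(tokens):
    """Normalized bag of words: in-vocabulary word -> relative frequency."""
    counts = Counter(t for t in tokens if t in TOY_EMBEDDINGS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no in-vocabulary tokens")
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst):
    """Move each source word's full weight to its nearest destination word."""
    return sum(
        weight * min(
            math.dist(TOY_EMBEDDINGS[word], TOY_EMBEDDINGS[other])
            for other in dst
        )
        for word, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed Word Mover's Distance: the larger of the two one-sided costs."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


def is_match(tokens_a, tokens_b, threshold=0.5):
    """Hypothetical decision rule: declare a match when the distance is small."""
    return relaxed_wmd(tokens_a, tokens_b) <= threshold


# Pooled review tokens for three listings (made-up data).
restaurant_site_1 = ["pizza", "pasta", "italian"]
restaurant_site_2 = ["italian", "pizza", "pizza"]
salon_site_2 = ["haircut", "salon", "stylist"]

print(is_match(restaurant_site_1, restaurant_site_2))
print(is_match(restaurant_site_1, salon_site_2))
```

In this sketch all reviews of a listing are pooled into one token list before the distance is computed, which is one simple way to reflect the abstract's rationale that the content of user evaluations converges as their number grows. The threshold would need to be tuned on labeled pairs in any real setting.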
Pages: 1195-1204
Page count: 10