A Latent Topic Model for Complete Entity Resolution

Cited by: 0
Authors
Shu, Liangcai [1]
Long, Bo [1]
Meng, Weiyi [1]
Affiliations
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
Keywords
DISTRIBUTIONS;
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification codes
081202; 0835
Abstract
In bibliographies such as DBLP and Citeseer, three kinds of entity-name problems need to be solved. First, multiple entities share one name, which is called the name sharing problem. Second, one entity has different names, which is called the name variant problem. Third, multiple entities share multiple names, which is called the name mixing problem. In this paper we aim to solve all three problems with a single model, a task we call complete entity resolution. Unlike previous work, our work uses global information drawn from data with two types of information: words and author names. We propose a generative latent topic model that involves both author names and words, the LDA-dual model, by extending the LDA (Latent Dirichlet Allocation) model. We also propose a method for obtaining the model parameters, which constitute the global information. Based on the obtained model parameters, we propose two algorithms to solve the three problems mentioned above. Experimental results demonstrate the effectiveness and great potential of the proposed model and algorithms.
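The three name problems defined in the abstract can be illustrated on toy data. The following is a minimal Python sketch, not the paper's LDA-dual model or its algorithms: it assumes hypothetical ground-truth (entity, name) pairs are already known and simply flags where each problem occurs. The function name `classify_name_problems`, the sample records, and the reading of "name mixing" as two entities having at least two names in common are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def classify_name_problems(records):
    """Flag the three name problems in (entity_id, name) pairs.

    `records` is hypothetical ground truth; a real resolver would have
    to infer the entity ids, which is the task the abstract describes.
    """
    names_of = defaultdict(set)     # entity -> names it appears under
    entities_of = defaultdict(set)  # name -> entities that use it
    for entity, name in records:
        names_of[entity].add(name)
        entities_of[name].add(entity)

    # Name sharing: one name is used by more than one entity.
    sharing = {n for n, es in entities_of.items() if len(es) > 1}
    # Name variant: one entity appears under more than one name.
    variant = {e for e, ns in names_of.items() if len(ns) > 1}
    # Name mixing (assumed reading): two entities have >= 2 names in common.
    mixing = {
        (a, b)
        for a, b in combinations(sorted(names_of), 2)
        if len(names_of[a] & names_of[b]) > 1
    }
    return sharing, variant, mixing


# Invented sample data; the names do not refer to real authors.
records = [
    ("e1", "Ann Lee"), ("e2", "Ann Lee"),          # sharing
    ("e3", "Carl Diaz"), ("e3", "C. Diaz"),        # variant
    ("e4", "Mia Park"), ("e4", "M. Park"),         # mixing with e5
    ("e5", "Mia Park"), ("e5", "M. Park"),
]
sharing, variant, mixing = classify_name_problems(records)
```

In this toy data, entities `e4` and `e5` exhibit all three problems at once, which suggests why a single model covering every case is attractive.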
Pages: 880-891
Page count: 12
Related papers
50 in total
  • [21] IPHITS: An Incremental Latent Topic Model for Link Structure
    Ma, Huifang
    Zhao, Weizhong
    Li, Zhixin
    Shi, Zhongzhi
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2009, 5839 : 242 - +
  • [22] Learning Latent Topic Information for Language Model Adaptation
    Lu, Shixiang
    Wei, Wei
    Fu, Xiaoyin
    Fan, Lichun
    Xu, Bo
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, 2012, 333 : 143 - 153
  • [23] GPLDA: A Generalized Poisson Latent Dirichlet Topic Model
    Bala, Ibrahim Bakari
    Saringat, Mohd Zainuri
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (12) : 403 - 407
  • [24] Shared Parts Latent Topic Model for Image Classification
    Bin, Yang
    ADVANCED MATERIALS AND INFORMATION TECHNOLOGY PROCESSING, PTS 1-3, 2011, 271-273 : 1257 - +
  • [25] A hierarchical latent topic model based on sparse coding
    Zhu, Wenjun
    Zhang, Liqing
    Bian, Qianwei
    NEUROCOMPUTING, 2012, 76 (01) : 28 - 35
  • [26] LATENT TOPIC VISUAL LANGUAGE MODEL FOR OBJECT CATEGORIZATION
    Wu, Lei
    Yu, Nenghai
    Liu, Jing
    Li, Mingjing
    2011 PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATIONS (SIGMAP 2011), 2011,
  • [27] A Temporal Latent Topic Model for Facial Expression Recognition
    Shang, Lifeng
    Chan, Kwok-Ping
    COMPUTER VISION - ACCV 2010, PT IV, 2011, 6495 : 51 - 63
  • [28] Online Topic-Aware Entity Resolution Over Incomplete Data Streams
    Ren, Weilong
    Lian, Xiang
    Ghazinour, Kambiz
    SIGMOD '21: PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2021, : 1478 - 1490
  • [29] Topic Model Based Knowledge Graph for Entity Similarity Measuring
    Sun, Haoran
    Ren, Rui
    Cai, Hongming
    Xu, Boyi
    Liu, Yonggang
    Li, Tongyu
    2018 IEEE 15TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE 2018), 2018, : 94 - 101
  • [30] Proposal of Network Generation Model based on Latent Preference Topic
    Akayama, Ikuto
    Hijikata, Yoshinori
    Kuramochi, Toshiya
    Sakata, Nobuchika
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2018), 2018,