Crowd labeling latent Dirichlet allocation

Authors
Luca Pion-Tonachini
Scott Makeig
Ken Kreutz-Delgado
Affiliations
[1] University of California at San Diego, Department of Electrical and Computer Engineering
[2] University of California at San Diego, Swartz Center for Computational Neuroscience
[3] University of California at San Diego, Calit2/QI Pattern Recognition Laboratory
Source
Knowledge and Information Systems, 2017, 53 (03)
Keywords
Crowd labeling; Generative model; Bayesian; Latent Dirichlet allocation; EEG
Abstract
Large, unlabeled datasets are abundant nowadays, but getting labels for those datasets can be expensive and time-consuming. Crowd labeling is a crowdsourcing approach for gathering such labels from workers whose suggestions are not always accurate. While a variety of algorithms exist for this purpose, we present crowd labeling latent Dirichlet allocation (CL-LDA), a generalization of latent Dirichlet allocation that can solve a more general set of crowd labeling problems. We show that it performs as well as other methods and at times better on a variety of simulated and actual datasets while treating each label as compositional rather than indicating a discrete class. In addition, prior knowledge of workers’ abilities can be incorporated into the model through a structured Bayesian framework. We then apply CL-LDA to the EEG independent component labeling dataset, using its generalizations to further explore the utility of the algorithm. We discuss prospects for creating classifiers from the generated labels.
Pages: 749 - 765
Page count: 16
Related papers
  • [1] Crowd labeling latent Dirichlet allocation
    Pion-Tonachini, Luca
    Makeig, Scott
    Kreutz-Delgado, Ken
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 53 (03) : 749 - 765
  • [2] Latent Dirichlet allocation
    Blei, DM
    Ng, AY
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 993 - 1022
  • [3] Latent Dirichlet allocation
    Blei, DM
    Ng, AY
    Jordan, MI
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2, 2002, 14 : 601 - 608
  • [4] Sequential latent Dirichlet allocation
    Du, Lan
    Buntine, Wray
    Jin, Huidong
    Chen, Changyou
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 31 (03) : 475 - 503
  • [5] Collective Latent Dirichlet Allocation
    Shen, Zhi-Yong
    Sun, Jun
    Shen, Yi-Dong
    [J]. ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 1019 - 1024
  • [6] The Security of Latent Dirichlet Allocation
    Mei, Shike
    Zhu, Xiaojin
    [J]. ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 38, 2015, 38 : 681 - 689
  • [7] Sequential latent Dirichlet allocation
    Lan Du
    Wray Buntine
    Huidong Jin
    Changyou Chen
    [J]. Knowledge and Information Systems, 2012, 31 : 475 - 503
  • [8] Uncovering the Latent Structures of Crowd Labeling
    Tian, Tian
    Zhu, Jun
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART I, 2015, 9077 : 392 - 404
  • [9] Topic Labeling Towards News Document Collection Based on Latent Dirichlet Allocation and Ontology
    Adhitama, Rifki
    Kusumaningrum, Retno
    Gernowo, Rahmat
    [J]. 2017 1ST INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS), 2017, : 247 - 251
  • [10] Distributed Latent Dirichlet Allocation on Streams
    Guo, Yunyan
    Li, Jianzhong
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2022, 16 (01)