Communication-efficient algorithms for parallel latent Dirichlet allocation

Cited: 3
Authors
Yan, Jian-Feng [1 ]
Zeng, Jia [1 ]
Gao, Yang [1 ]
Liu, Zhi-Qiang [2 ]
Affiliations
[1] Suzhou Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] City Univ Hong Kong, Sch Creat Media, Hong Kong, Hong Kong, Peoples R China
Keywords
Latent Dirichlet allocation; Parallel learning; Zipf's law; Belief propagation; Gibbs sampling
DOI
10.1007/s00500-014-1376-8
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Latent Dirichlet allocation (LDA) is a popular topic modeling method with many multimedia applications, such as motion analysis and image categorization. Communication cost is one of the main bottlenecks in large-scale parallel learning of LDA. To reduce this cost, we exploit Zipf's law and propose novel parallel LDA algorithms that communicate only the most important partial information at each learning iteration. The proposed algorithms are much more efficient than the current state-of-the-art algorithms in both communication and computation costs. Extensive experiments on large-scale data sets demonstrate that our algorithms greatly reduce communication and computation costs and achieve better scalability.
Pages: 3-11
Number of pages: 9
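The record gives no implementation details beyond the abstract. As a non-authoritative illustration of the communication-saving idea the abstract describes — transmit only the word-topic information for the most important words, which by Zipf's law cover most tokens with a small fraction of the vocabulary — here is a minimal Python sketch. The function names (select_important_words, partial_update_message), the coverage threshold, and the synchronization scheme are assumptions for illustration, not the paper's actual algorithm.

```python
# Illustrative sketch only; this is NOT the algorithm from the paper.
import numpy as np

def select_important_words(word_counts, coverage=0.9):
    """Pick the smallest prefix of words, by descending frequency, that
    covers `coverage` of all tokens. Under Zipf's law this prefix is a
    small fraction of the vocabulary, so sending only these rows of the
    word-topic count matrix saves most of the communication."""
    order = np.argsort(word_counts)[::-1]           # words, most frequent first
    cum = np.cumsum(word_counts[order])
    cutoff = int(np.searchsorted(cum, coverage * cum[-1])) + 1
    return order[:cutoff]

def partial_update_message(local_delta, word_counts, coverage=0.9):
    """Build the message a worker sends at synchronization time: only the
    rows (word-topic count changes) of the important words.
    `local_delta` is this worker's V x K change since the last sync."""
    important = select_important_words(word_counts, coverage)
    return {int(w): local_delta[w] for w in important if local_delta[w].any()}

# Toy usage: a Zipf-distributed vocabulary of 10,000 words, 50 topics.
rng = np.random.default_rng(0)
V, K = 10_000, 50
word_counts = rng.zipf(1.2, size=V).astype(float)   # heavy-tailed frequencies
local_delta = rng.poisson(0.01, size=(V, K))        # sparse per-iteration changes
msg = partial_update_message(local_delta, word_counts)
print(f"sending {len(msg)} of {V} word rows")       # far fewer than V
```

The design point being illustrated: because word frequencies are heavy-tailed, truncating the synchronized matrix to a high-coverage head of the vocabulary shrinks each message dramatically while losing little of the information that drives the topic estimates.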