Accelerating distributed Expectation-Maximization algorithms with frequent updates

Cited by: 8
Authors
Yin, Jiangtao [1]
Zhang, Yanfeng [2]
Gao, Lixin [1]
Affiliations
[1] Univ Massachusetts Amherst, 151 Holdsworth Way, Amherst, MA 01003 USA
[2] Northeastern Univ, 11 Wenhua Rd, Shenyang 110819, Liaoning, Peoples R China
Funding
National Science Foundation (United States);
Keywords
Expectation-Maximization; Frequent updates; Concurrent updates; Distributed framework; Clustering; Topic modeling; MAXIMUM-LIKELIHOOD; EM; FRAMEWORK; HADOOP;
DOI
10.1016/j.jpdc.2017.07.005
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Expectation-Maximization (EM) is a popular approach for parameter estimation in many applications, such as image understanding, document classification, and genome data analysis. Despite the popularity of EM algorithms, it is challenging to efficiently implement these algorithms in a distributed environment for handling massive data sets. In particular, many EM algorithms that frequently update the parameters have been shown to be much more efficient than their concurrent counterparts. Accordingly, we propose two approaches to parallelize such EM algorithms in a distributed environment so as to scale to massive data sets. We prove that both approaches maintain the convergence properties of the EM algorithms. Based on the approaches, we design and implement a distributed framework, FreEM, to support the implementation of frequent updates for the EM algorithms. We show its efficiency through two categories of EM applications, clustering and topic modeling. These applications include k-means clustering, fuzzy c-means clustering, parameter estimation for the Gaussian Mixture Model, and variational inference for Latent Dirichlet Allocation. We extensively evaluate our framework on both a cluster of local machines and the Amazon EC2 cloud. Our evaluation shows that the EM algorithms with frequent updates implemented on FreEM can converge much faster than those implementations with traditional concurrent updates. (C) 2017 Elsevier Inc. All rights reserved.
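The contrast the abstract draws between concurrent and frequent updates can be illustrated on a single machine with one-dimensional k-means. The sketch below is not FreEM and is not taken from the paper: the function names, the block-wise schedule, and the running sufficient statistics are assumptions chosen for illustration, in the style of incremental EM. The concurrent variant recomputes centers once per full pass over the data; the frequent variant refreshes them after every block of points.

```python
def nearest(x, centers):
    """Index of the center closest to x (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)


def concurrent_kmeans(points, centers, sweeps):
    """Concurrent updates: assign ALL points, then recompute centers once."""
    centers = list(centers)
    for _ in range(sweeps):
        sums = [0.0] * len(centers)
        counts = [0] * len(centers)
        for x in points:  # E-step over the whole data set
            k = nearest(x, centers)
            sums[k] += x
            counts[k] += 1
        # M-step: one parameter update per full pass
        centers = [sums[k] / counts[k] if counts[k] else centers[k]
                   for k in range(len(centers))]
    return centers


def frequent_kmeans(points, centers, sweeps, block_size):
    """Frequent updates (illustrative): recompute centers after each block.

    Running per-cluster sums and counts are kept so that re-assigning a
    point only moves its contribution from the old cluster to the new one.
    During the first sweep the statistics cover only the blocks seen so far.
    """
    centers = list(centers)
    num_clusters = len(centers)
    assignment = [None] * len(points)
    sums = [0.0] * num_clusters
    counts = [0] * num_clusters
    for _ in range(sweeps):
        for start in range(0, len(points), block_size):
            stop = min(start + block_size, len(points))
            for i in range(start, stop):  # partial E-step on one block
                x = points[i]
                k = nearest(x, centers)
                old = assignment[i]
                if old is not None:  # retract the stale contribution
                    sums[old] -= x
                    counts[old] -= 1
                sums[k] += x
                counts[k] += 1
                assignment[i] = k
            # M-step right after the block, so the next block sees fresh centers
            centers = [sums[k] / counts[k] if counts[k] else centers[k]
                       for k in range(num_clusters)]
    return centers
```

Calling `frequent_kmeans(points, centers, sweeps, block_size=len(points))` reduces to the concurrent schedule, since the only block is the full data set. Smaller blocks let later points in the same pass benefit from centers already adjusted by earlier blocks, which is the intuition behind the faster convergence the abstract reports; how such updates are distributed across workers is the subject of the paper itself and is not modeled here.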
Pages: 65-75
Number of pages: 11