Learning mixtures of structured distributions over discrete domains

Authors
Chan, Siu-On [1]
Diakonikolas, Ilias [2]
Servedio, Rocco A. [3]
Sun, Xiaorui [3]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] Columbia Univ, New York, NY 10027 USA
Keywords
MAXIMUM-LIKELIHOOD-ESTIMATION; LOG-CONCAVE; DENSITY; PROBABILITY; MONOTONE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202 ;
Abstract
Let C be a class of probability distributions over the discrete domain $[n] = \{1, \ldots, n\}$. We show that if C satisfies a rather general condition (essentially, that each distribution in C can be well-approximated by a variable-width histogram with few bins), then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from C. We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems as described below. More precisely:
Log-concave distributions: We learn any mixture of $k$ log-concave distributions over $[n]$ using $k \cdot \tilde{O}(1/\epsilon^4)$ samples (independent of $n$) and running in time $\tilde{O}(k \log(n)/\epsilon^4)$ bit-operations (note that reading a single sample from $[n]$ takes $\Theta(\log n)$ bit operations). For the special case $k = 1$ we give an efficient algorithm using $\tilde{O}(1/\epsilon^3)$ samples; this generalizes the main result of [DDS12b] from the class of Poisson Binomial distributions to the much broader class of all log-concave distributions. Our upper bounds are not far from optimal since any algorithm for this learning problem requires $\Omega(k/\epsilon^{5/2})$ samples.
Monotone hazard rate (MHR) distributions: We learn any mixture of $k$ MHR distributions over $[n]$ using $O(k \log(n/\epsilon)/\epsilon^4)$ samples and running in time $\tilde{O}(k \log^2(n)/\epsilon^4)$ bit-operations. Any algorithm for this learning problem must use $\Omega(k \log(n)/\epsilon^3)$ samples.
Unimodal distributions: We give an algorithm that learns any mixture of $k$ unimodal distributions over $[n]$ using $O(k \log(n)/\epsilon^4)$ samples and running in time $\tilde{O}(k \log^2(n)/\epsilon^4)$ bit-operations. Any algorithm for this problem must use $\Omega(k \log(n)/\epsilon^3)$ samples.
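The structural condition in the abstract (closeness to a variable-width histogram with few bins) can be illustrated with a small sketch. This is not the paper's algorithm. It only shows, for one hand-picked distribution, how averaging the probability mass over a few intervals of growing width gives a piecewise-constant approximation and how far that approximation is in total variation distance. The function names `flatten`, `tv_distance`, and `growing_edges`, the truncated geometric example, and the growth factor 1.25 are all assumptions made for illustration.

```python
def flatten(p, edges):
    """Average p over each interval [edges[i], edges[i+1]) (0-based indices)."""
    q = [0.0] * len(p)
    for lo, hi in zip(edges, edges[1:]):
        avg = sum(p[lo:hi]) / (hi - lo)
        for i in range(lo, hi):
            q[i] = avg
    return q


def tv_distance(p, q):
    """Total variation distance between two distributions on the same domain."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def growing_edges(n, growth=1.25):
    """Interval boundaries whose widths grow geometrically (illustrative choice)."""
    edges = [0]
    width = 1.0
    while edges[-1] < n:
        edges.append(min(n, edges[-1] + max(1, int(width))))
        width *= growth
    return edges


# Hypothetical example: a truncated geometric distribution on a domain of
# size n, which is both monotone and log-concave.
n = 1024
weights = [0.99 ** i for i in range(n)]
total = sum(weights)
p = [w / total for w in weights]

edges = growing_edges(n)
q = flatten(p, edges)          # variable-width histogram with few bins
uniform = flatten(p, [0, n])   # single-bin histogram, for comparison

print(len(edges) - 1, "bins out of", n, "domain points")
print("TV distance, variable-width histogram:", tv_distance(p, q))
print("TV distance, single bin:", tv_distance(p, uniform))
```

In this sketch the bins are chosen from the true distribution's domain alone, without looking at samples. A learning algorithm would have to work from samples instead, and the sample and time bounds quoted in the abstract refer to that harder setting.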
Pages: 1380-1394
Number of pages: 15
Related papers
50 in total
  • [1] Learning mixtures of product distributions over discrete domains
    Feldman, J
    O'Donnell, R
    Servedio, RA
    [J]. 46TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2005, : 501 - 510
  • [2] Learning mixtures of product distributions over discrete domains
    Feldman, Jon
    O'Donnell, Ryan
    Servedio, Rocco A.
    [J]. SIAM JOURNAL ON COMPUTING, 2008, 37 (05) : 1536 - 1564
  • [3] Differentially Private Learning of Structured Discrete Distributions
    Diakonikolas, Ilias
    Hardt, Moritz
    Schmidt, Ludwig
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [4] Learning Arbitrary Statistical Mixtures of Discrete Distributions
    Li, Jian
    Rabani, Yuval
    Schulman, Leonard J.
    Swamy, Chaitanya
    [J]. STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2015, : 743 - 752
  • [5] PAC learning of probability distributions over a discrete domain
    Magnoni, L
    Mirolli, M
    Montagna, F
    Simi, G
    [J]. THEORETICAL COMPUTER SCIENCE, 2003, 299 (1-3) : 37 - 63
  • [6] MIXTURES OF SOME DISCRETE DISTRIBUTIONS
    GUPTA, RC
    [J]. SOUTH AFRICAN STATISTICAL JOURNAL, 1974, 8 (02) : 83 - 92
  • [7] Testing Mixtures of Discrete Distributions
    Aliakbarpour, Maryam
    Kumar, Ravi
    Rubinfeld, Ronitt
    [J]. CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [8] ERLANG MIXTURES OF SOME DISCRETE DISTRIBUTIONS
    Roy, M. K.
    Haque, M. E.
    Roy, D. C.
    [J]. PAKISTAN JOURNAL OF STATISTICS, 2008, 24 (01) : 45 - 56
  • [9] On spectral learning of mixtures of distributions
    Achlioptas, D
    McSherry, F
    [J]. LEARNING THEORY, PROCEEDINGS, 2005, 3559 : 458 - 469
  • [10] A REPRESENTATION FOR DISCRETE-DISTRIBUTIONS BY EQUIPROBABLE MIXTURES
    PETERSON, AV
    KRONMAL, RA
    [J]. JOURNAL OF APPLIED PROBABILITY, 1980, 17 (01) : 102 - 111