Learning mixtures of structured distributions over discrete domains

Cited by: 0

Authors:

Chan, Siu-On [1]

Diakonikolas, Ilias [2]

Servedio, Rocco A. [3]

Sun, Xiaorui [3]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland

[3] Columbia Univ, New York, NY 10027 USA

Source:

PROCEEDINGS OF THE TWENTY-FOURTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS (SODA 2013) | 2013

Keywords:

MAXIMUM-LIKELIHOOD-ESTIMATION; LOG-CONCAVE; DENSITY; PROBABILITY; MONOTONE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202 ;

Abstract:

Let $C$ be a class of probability distributions over the discrete domain $[n] = \{1, \dots, n\}$. We show that if $C$ satisfies a rather general condition - essentially, that each distribution in $C$ can be well-approximated by a variable-width histogram with few bins - then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $C$. We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems as described below. More precisely,

Log-concave distributions: We learn any mixture of $k$ log-concave distributions over $[n]$ using $k \cdot \tilde{O}(1/\epsilon^4)$ samples (independent of $n$) and running in time $\tilde{O}(k \log(n)/\epsilon^4)$ bit-operations (note that reading a single sample from $[n]$ takes $\Theta(\log n)$ bit operations). For the special case $k = 1$ we give an efficient algorithm using $\tilde{O}(1/\epsilon^3)$ samples; this generalizes the main result of [DDS12b] from the class of Poisson Binomial distributions to the much broader class of all log-concave distributions. Our upper bounds are not far from optimal since any algorithm for this learning problem requires $\Omega(k/\epsilon^{5/2})$ samples.

Monotone hazard rate (MHR) distributions: We learn any mixture of $k$ MHR distributions over $[n]$ using $O(k \log(n/\epsilon)/\epsilon^4)$ samples and running in time $\tilde{O}(k \log^2(n)/\epsilon^4)$ bit-operations. Any algorithm for this learning problem must use $\Omega(k \log(n)/\epsilon^3)$ samples.

Unimodal distributions: We give an algorithm that learns any mixture of $k$ unimodal distributions over $[n]$ using $O(k \log(n)/\epsilon^4)$ samples and running in time $\tilde{O}(k \log^2(n)/\epsilon^4)$ bit-operations. Any algorithm for this problem must use $\Omega(k \log(n)/\epsilon^3)$ samples.
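
The structural condition in the abstract, that a distribution is well-approximated by a variable-width histogram with few bins, can be illustrated with a minimal sketch. This is not the algorithm from the paper: the helper names `greedy_bins`, `flatten`, and `tv_distance`, the equal-mass binning rule, and the truncated geometric test distribution are all assumptions chosen for illustration. The sketch works with a known probability vector rather than samples, and only shows that a log-concave distribution over a domain of size 1000 can be close in total variation distance to a histogram with roughly 20 bins.

```python
from typing import List, Tuple


def greedy_bins(p: List[float], t: int) -> List[Tuple[int, int]]:
    """Partition indices 0..n-1 into intervals (lo, hi), closing an
    interval as soon as its probability mass reaches 1/t.
    Hypothetical helper, not taken from the paper."""
    bins, start, mass = [], 0, 0.0
    for i, p_i in enumerate(p):
        mass += p_i
        if mass >= 1.0 / t:
            bins.append((start, i))
            start, mass = i + 1, 0.0
    if start < len(p):
        # Leftover indices form a final bin with mass below 1/t.
        bins.append((start, len(p) - 1))
    return bins


def flatten(p: List[float], bins: List[Tuple[int, int]]) -> List[float]:
    """Replace p by its average on each interval (a variable-width histogram)."""
    q = [0.0] * len(p)
    for lo, hi in bins:
        avg = sum(p[lo:hi + 1]) / (hi - lo + 1)
        for i in range(lo, hi + 1):
            q[i] = avg
    return q


def tv_distance(p: List[float], q: List[float]) -> float:
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    n, t = 1000, 20
    # Truncated geometric distribution: an example of a log-concave pmf.
    weights = [0.99 ** i for i in range(n)]
    total = sum(weights)
    p = [w / total for w in weights]

    bins = greedy_bins(p, t)
    q = flatten(p, bins)
    print("number of bins:", len(bins))
    print("total variation distance:", tv_distance(p, q))
```

Each closed interval carries mass at least $1/t$, so the partition has at most $t + 1$ intervals regardless of $n$; this is the sense in which the histogram has "few bins" in this toy setting.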

Pages: 1380-1394

Number of pages: 15
