Testing Mixtures of Discrete Distributions

Citations: 0
Authors
Aliakbarpour, Maryam [1]
Kumar, Ravi [2]
Rubinfeld, Ronitt [1, 3]
Affiliations
[1] MIT, CSAIL, Cambridge, MA 02139 USA
[2] Google, Mountain View, CA USA
[3] Tel Aviv Univ, Tel Aviv, Israel
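Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract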
There has been significant study on the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size $n$, distinguishing the uniform distribution from distributions that are far from uniform in $\ell_1$-distance uses only $O(\sqrt{n})$ samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish whether the samples come from a distribution that is $\epsilon$-close to uniform or from one that is $(1 - \epsilon)$-far from uniform. The latter task requires a number of samples that is nearly linear in $n$ (Valiant, 2008; Valiant and Valiant, 2017a). In this work, we present a noise model that on the one hand is more tractable for the testing problem, and on the other hand represents a rich class of noise families. In our model, the noisy distribution is a mixture of the original distribution and noise, where the latter is known to the tester either explicitly or via sample access; the form of the noise is also known a priori. Focusing on the identity and closeness testing problems leads to the following mixture testing question: Given samples of distributions $p$, $q_1$, $q_2$, can we test if $p$ is a mixture of $q_1$ and $q_2$? We consider this general question in various scenarios that differ in terms of how the tester can access the distributions, and show that indeed this problem is more tractable. Our results show that the sample complexity of our testers is exactly the same as for the classical non-mixture case.
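
To make the mixture property in the abstract concrete, the following is a minimal sketch that checks it for explicitly given probability vectors. It is not the paper's sample-based tester and makes no claim about sample complexity. The function names `l1_distance` and `mixture_weight` and the tolerance `tol` are assumptions introduced here for illustration only.

```python
from typing import Optional, Sequence


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """l1 distance between two probability vectors on the same domain."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mixture_weight(
    p: Sequence[float],
    q1: Sequence[float],
    q2: Sequence[float],
    tol: float = 1e-9,  # hypothetical numerical tolerance, not from the source
) -> Optional[float]:
    """Return alpha in [0, 1] with p = alpha*q1 + (1 - alpha)*q2, else None.

    Works on fully known distributions (illustration only).
    """
    # p - q2 = alpha * (q1 - q2), so project (p - q2) onto (q1 - q2).
    d = [a - b for a, b in zip(q1, q2)]
    r = [a - b for a, b in zip(p, q2)]
    denom = sum(x * x for x in d)
    if denom == 0.0:
        # q1 and q2 coincide: p is a mixture only if it equals them.
        return 0.0 if l1_distance(p, q2) <= tol else None
    alpha = sum(x * y for x, y in zip(r, d)) / denom
    alpha = min(1.0, max(0.0, alpha))  # mixture weights must lie in [0, 1]
    mix = [alpha * a + (1.0 - alpha) * b for a, b in zip(q1, q2)]
    return alpha if l1_distance(p, mix) <= tol else None


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    noise = [0.5, 0.5, 0.0, 0.0]
    noisy = [0.2 * a + 0.8 * b for a, b in zip(noise, uniform)]
    print(mixture_weight(noisy, noise, uniform))
    print(mixture_weight([1.0, 0.0, 0.0, 0.0], noise, uniform))
```

In the setting the abstract describes, the tester sees only samples (or has explicit access to the noise distribution alone), so a check like this is just the ground-truth property that such a tester approximates in $\ell_1$-distance.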
Pages: 32