Unsupervised Tree Boosting for Learning Probability Distributions

Cited by: 0
Authors
Awaya, Naoki [1]
Ma, Li [2]
Affiliations
[1] Waseda Univ, Sch Polit Sci & Econ, Shinjuku City, Tokyo 1698050, Japan
[2] Duke Univ, Dept Stat Sci, Durham, NC 27708 USA
Keywords
generative models; normalizing flows; additive models; density estimation; ensemble methods; recursive partitioning; Pólya tree
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We propose an unsupervised tree boosting algorithm for inferring the underlying sampling distribution of an i.i.d. sample based on fitting additive tree ensembles in a manner analogous to supervised tree boosting. Integral to the algorithm is a new notion of "addition" on probability distributions that leads to a coherent notion of "residualization", i.e., subtracting a probability distribution from an observation to remove the distributional structure from the sampling distribution of the latter. We show that these notions arise naturally for univariate distributions through cumulative distribution function (CDF) transforms and compositions due to several "group-like" properties of univariate CDFs. While the traditional multivariate CDF does not preserve these properties, a new definition of multivariate CDF can restore these properties, thereby allowing the notions of "addition" and "residualization" to be formulated for multivariate settings as well. This then gives rise to the unsupervised boosting algorithm based on forward-stagewise fitting of an additive tree ensemble, which sequentially reduces the Kullback-Leibler divergence from the truth. The algorithm allows analytic evaluation of the fitted density and outputs a generative model that can be readily sampled from. We enhance the algorithm with scale-dependent shrinkage and a two-stage strategy that separately fits the marginals and the copula. The algorithm then performs competitively with state-of-the-art deep-learning approaches in multivariate density estimation on multiple benchmark data sets.
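The univariate case described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes data on [0, 1], uses a fixed-bin histogram as a stand-in for a tree-based weak learner, and uses a single constant shrinkage factor instead of the scale-dependent shrinkage mentioned above. All function names and parameters (`fit_step`, `cdf_and_logpdf`, `boost`, `n_bins`, `shrink`) are hypothetical. Each round fits a density to the current residuals, adds its log density to the running total, and then "residualizes" by pushing the residuals through the fitted CDF, so the overall fitted CDF is a composition of the per-round CDFs.

```python
import numpy as np

def fit_step(r, n_bins=4, shrink=0.5):
    """Fit a histogram density on [0, 1] to r, shrunk toward the uniform density."""
    counts, edges = np.histogram(r, bins=n_bins, range=(0.0, 1.0))
    probs = counts / counts.sum()
    # Assumed constant shrinkage: mix the fitted bin masses with uniform masses.
    probs = (1.0 - shrink) / n_bins + shrink * probs
    return edges, probs

def cdf_and_logpdf(r, edges, probs):
    """Evaluate the piecewise-linear CDF and the log density of the histogram at r."""
    n_bins = len(probs)
    idx = np.clip(np.searchsorted(edges, r, side="right") - 1, 0, n_bins - 1)
    width = edges[1:] - edges[:-1]
    cum = np.concatenate(([0.0], np.cumsum(probs)))
    cdf = cum[idx] + probs[idx] * (r - edges[idx]) / width[idx]
    logpdf = np.log(probs[idx] / width[idx])
    return np.clip(cdf, 0.0, 1.0), logpdf

def boost(x, n_rounds=20, n_bins=4, shrink=0.5):
    """Forward-stagewise fit: returns the learners, final residuals, and log densities."""
    r = np.asarray(x, dtype=float).copy()
    loglik = np.zeros_like(r)
    learners = []
    for _ in range(n_rounds):
        edges, probs = fit_step(r, n_bins, shrink)
        cdf, logpdf = cdf_and_logpdf(r, edges, probs)
        loglik += logpdf      # chain rule: log densities of composed CDFs add up
        r = cdf               # residualize: map residuals through the fitted CDF
        learners.append((edges, probs))
    return learners, r, loglik

# Illustrative data only: a Beta(2, 5) sample on [0, 1].
rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=2000)
learners, resid, loglik = boost(x)
print("mean in-sample log density:", loglik.mean())
```

Because each round mixes the fitted histogram with the uniform density, every round's in-sample log-likelihood contribution is non-negative, which mirrors (in a simplified way) the sequential reduction of Kullback-Leibler divergence described in the abstract.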
Pages: 52