Zero-Inflated Exponential Family Embeddings

Authors
Liu, Li-Ping [1, 2]
Blei, David M. [1]
Affiliations
[1] Columbia Univ, 500 W 120th St, New York, NY 10027 USA
[2] Tufts Univ, 161 Coll Ave, Medford, MA 02155 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70 | 2017 / Vol. 70
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Word embeddings are a widely used tool to analyze language, and exponential family embeddings (Rudolph et al., 2016) generalize the technique to other types of data. One challenge to fitting embedding methods is sparse data, such as a document/term matrix that contains many zeros. To address this issue, practitioners typically downweight or subsample the zeros, thus focusing learning on the non-zero entries. In this paper, we develop zero-inflated embeddings, a new embedding method that is designed to learn from sparse observations. In a zero-inflated embedding (ZIE), a zero in the data can come from an interaction with other data (i.e., an embedding) or from a separate process by which many observations are equal to zero (i.e., a probability mass at zero). Fitting a ZIE naturally downweights the zeros and dampens their influence on the model. Across many types of data (language, movie ratings, shopping histories, and bird watching logs), we found that zero-inflated embeddings provide improved predictive performance over standard approaches and find better vector representations of items.
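The two-source view of a zero described in the abstract can be illustrated with a generic zero-inflated Poisson likelihood. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the Poisson choice is only one member of the exponential family, and `rate` and `pi_zero` are passed in as plain numbers because the abstract does not say how the paper parameterizes them.

```python
import math


def zip_log_likelihood(x, rate, pi_zero):
    """Log-probability of a count x under a zero-inflated Poisson.

    rate    -- Poisson rate of the embedding component (assumed given here)
    pi_zero -- probability mass placed on the separate "always zero" process
    """
    if rate <= 0.0 or not 0.0 <= pi_zero < 1.0:
        raise ValueError("need rate > 0 and 0 <= pi_zero < 1")
    log_pois = -rate + x * math.log(rate) - math.lgamma(x + 1)
    if x == 0:
        # A zero can come from the point mass or from the Poisson component.
        return math.log(pi_zero + (1.0 - pi_zero) * math.exp(log_pois))
    # A non-zero count can only come from the Poisson component.
    return math.log(1.0 - pi_zero) + log_pois


def zero_responsibility(rate, pi_zero):
    """Posterior probability that an observed zero came from the Poisson
    (embedding) component rather than from the point mass at zero."""
    from_poisson = (1.0 - pi_zero) * math.exp(-rate)
    return from_poisson / (pi_zero + from_poisson)


if __name__ == "__main__":
    for pi_zero in (0.0, 0.5, 0.9):
        print(pi_zero, zero_responsibility(rate=1.0, pi_zero=pi_zero))
```

In this sketch, the derivative of the zero-case log-likelihood with respect to `rate` equals minus `zero_responsibility(rate, pi_zero)`, whereas a plain Poisson gives minus one. A zero therefore pulls on the embedding component with a weight below one whenever `pi_zero` is positive, which is one way to read the statement that fitting a ZIE downweights the zeros.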
Pages: 9