Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values

Cited by: 18
Authors
Antipov, Evgeny A. [1]
Pokryshevskaya, Elena B. [1]
Affiliations
[1] National Research University Higher School of Economics, Kantemirovskaya St 3, St Petersburg 194100, Russia
Funding
Russian Science Foundation
Keywords
Sales forecasting; Shapley value; Interpretable machine learning; Random forest; Gradient Boosting Machines; Elastic net; BIG DATA; SALES; PROMOTION; ANALYTICS; RETAILER; CATEGORY; BRAND; PRICE;
DOI
10.1057/s41272-020-00236-4
Chinese Library Classification number
F8 [Public finance, finance]
Discipline classification code
0202
Abstract
Forecasting demand and understanding sales drivers are among the most important tasks in retail analytics. Traditionally, however, sales modeling has relied predominantly on linear models and/or models with a small number of predictors. Real-world demand is determined by complex substitution and complementation patterns among a large number of interrelated SKUs, by nonlinear effects of prices, promotions, and seasonality, and by many other factors, their lagged values, and interactions, so a realistic model has to account for all of these. We propose a conceptual model for sales modeling based on standard POS data available to any retailer and generate almost 500 potentially useful predictors of a focal SKU's sales accordingly. In our comparison of three classes of models, Gradient Boosting Machines outperformed Random Forests and Elastic nets. Using interpretable machine learning methods, we obtained actionable insights into the importance of various groups of predictors from the conceptual model, and demonstrated how helpful it can be for marketing managers to decompose predictions into the effects of individual regressors by using an approximation of Shapley values for feature attribution.
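The decomposition of a prediction into per-regressor effects mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Shapley approximation the authors used, so the code below assumes a generic permutation-sampling estimator. The `toy_demand` function, its three features, and all numbers are hypothetical stand-ins for a fitted Gradient Boosting Machine and real POS data.

```python
import random


def toy_demand(features):
    """Hypothetical nonlinear demand model standing in for a fitted GBM."""
    price, promo, competitor_price = features
    base = 100.0 - 8.0 * price + 3.0 * competitor_price
    # Assumed interaction: a promotion lifts sales more when the price is low.
    lift = 20.0 * promo * (1.0 + max(0.0, 5.0 - price) / 5.0)
    return base + lift


def sampled_shapley(model, x, baseline, n_permutations=200, seed=0):
    """Monte Carlo estimate of Shapley values relative to one baseline row."""
    rng = random.Random(seed)
    n = len(x)
    totals = [0.0] * n
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)
        prev = model(current)
        # Switch features from baseline to observed values in random order
        # and credit each feature with the change in prediction it causes.
        for j in order:
            current[j] = x[j]
            new = model(current)
            totals[j] += new - prev
            prev = new
    return [t / n_permutations for t in totals]


# Hypothetical observation and reference point:
# (price, promo flag, competitor price).
x = [4.0, 1.0, 5.5]
baseline = [5.0, 0.0, 5.0]

phi = sampled_shapley(toy_demand, x, baseline)
gap = toy_demand(x) - toy_demand(baseline)

for name, value in zip(["price", "promo", "competitor_price"], phi):
    print(f"{name}: {value:+.2f}")
print(f"sum of attributions: {sum(phi):.2f}, prediction gap: {gap:.2f}")
```

Within each sampled ordering the marginal contributions telescope, so the attributions add up to the difference between the prediction for `x` and the prediction for the baseline. That additivity is what would let a manager read a forecast as a sum of price, promotion, and competitor effects.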
Pages: 355-364
Number of pages: 10
Related papers
50 in total
  • [1] Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values
    Evgeny A. Antipov
    Elena B. Pokryshevskaya
    [J]. Journal of Revenue and Pricing Management, 2020, 19 : 355 - 364
  • [2] Interpretable machine learning with an ensemble of gradient boosting machines
    Konstantinov, Andrei, V
    Utkin, Lev, V
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 222
  • [3] Dealing with High Dimensional Sentiment Data Using Gradient Boosting Machines
    Athanasiou, Vasileios
    Maragoudakis, Manolis
    [J]. ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2016, 2016, 475 : 481 - 489
  • [4] INTERPRETABLE MACHINE LEARNING OF HIGH-DIMENSIONAL AGING HEALTH TRAJECTORIES
    Farrell, Spencer
    Mitnitski, Arnold
    Rockwood, Kenneth
    Rutenberg, Andrew
    [J]. INNOVATION IN AGING, 2021, 5 : 672 - 672
  • [5] Interpretable machine learning for high-dimensional trajectories of aging health
    Farrell, Spencer
    Mitnitski, Arnold
    Rockwood, Kenneth
    Rutenberg, Andrew
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (01)
  • [6] Handling high-dimensional data with missing values by modern machine learning techniques
    Chen, Sixia
    Xu, Chao
    [J]. JOURNAL OF APPLIED STATISTICS, 2023, 50 (03) : 786 - 804
  • [7] Data analysis with Shapley values for automatic subject selection in Alzheimer’s disease data sets using interpretable machine learning
    Louise Bloch
    Christoph M. Friedrich
    [J]. Alzheimer's Research & Therapy, 13
  • [8] Data analysis with Shapley values for automatic subject selection in Alzheimer's disease data sets using interpretable machine learning
    Bloch, Louise
    Friedrich, Christoph M.
    [J]. ALZHEIMERS RESEARCH & THERAPY, 2021, 13 (01)
  • [9] Bayesian evolutionary hypernetworks for interpretable learning from high-dimensional data
    Kim, Soo-Jin
    Ha, Jung-Woo
    Kim, Heebal
    Zhang, Byoung-Tak
    [J]. APPLIED SOFT COMPUTING, 2019, 81
  • [10] High-dimensional data monitoring using support machines
    Maboudou-Tchao, Edgard M.
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (07) : 1927 - 1942