Linear regression for numeric symbolic variables: a least squares approach based on Wasserstein Distance

Cited by: 0
Authors
Antonio Irpino
Rosanna Verde
Affiliations
[1] Second University of Naples, Department of Political Sciences “J. Monnet”
Keywords
Modal symbolic variables; Probability distribution function; Histogram data; Regression; Wasserstein distance; 62J05; 62G30; 46F10;
DOI
Not available
Abstract
In this paper we present a new linear regression technique for distributional symbolic variables, i.e., variables whose realizations can be histograms, empirical distributions or empirical estimates of parametric distributions. Such data are known as numerical modal data according to the Symbolic Data Analysis definitions. In order to measure the error between the observed and the predicted distributions, the $\ell_2$ Wasserstein distance is proposed. Some properties of such a metric are exploited to predict the modal response variable as a linear combination of the explanatory modal variables. Based on the metric, the model uses the quantile functions associated with the data and thus is subject to a positivity constraint of the estimated parameters. We propose solving the linear regression problem by starting from a particular decomposition of the squared distance. Therefore, we estimate the model parameters according to two separate models, one for the averages of the data and one for the centered distributions by a constrained least squares algorithm. Measures of goodness-of-fit are also proposed and discussed. The method is validated by two applications, one on simulated data and one on two real-world datasets.
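As an illustration only, the sketch below computes a squared $\ell_2$ Wasserstein distance between two empirical distributions through their quantile functions, and splits it into a part due to the means and a part due to the centered distributions. The abstract does not state the formula, so the split used here is an assumption consistent with its description, and all function names (`quantile_values`, `wasserstein2_squared`, `decompose`) are hypothetical. It also assumes both samples have the same size, so that the sorted samples serve as quantile functions on a common grid.

```python
import random


def quantile_values(sample):
    """Sorted sample, read as the empirical quantile function on n equal-probability levels."""
    return sorted(float(v) for v in sample)


def mean(values):
    return sum(values) / len(values)


def wasserstein2_squared(qx, qy):
    """Squared l2 Wasserstein distance as the mean squared gap between quantile functions."""
    if len(qx) != len(qy):
        raise ValueError("this sketch assumes both quantile functions share one grid")
    return mean([(a - b) ** 2 for a, b in zip(qx, qy)])


def decompose(qx, qy):
    """Split the squared distance into a mean part and a centered-distribution part (assumed form)."""
    mx, my = mean(qx), mean(qy)
    mean_part = (mx - my) ** 2
    centered_part = wasserstein2_squared(
        [a - mx for a in qx], [b - my for b in qy]
    )
    return mean_part, centered_part


# Simulated data: two Gaussian samples of equal size (illustrative values only).
random.seed(0)
qx = quantile_values(random.gauss(0.0, 1.0) for _ in range(2000))
qy = quantile_values(random.gauss(2.0, 3.0) for _ in range(2000))

total = wasserstein2_squared(qx, qy)
mean_part, centered_part = decompose(qx, qy)
print(total, mean_part + centered_part)  # the two values agree up to rounding
```

Because the two parts add up to the full squared distance, a least squares criterion built on it can be minimized separately for the means and for the centered quantile functions, which is in line with the two-model estimation the abstract describes. The non-negativity constraint on the coefficients of the centered part is not implemented here.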
Pages: 81-106
Number of pages: 25
Related papers
50 in total
  • [31] New Bounds on Compressive Linear Least Squares Regression
    Kaban, Ata
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 33, 2014, 33 : 448 - 456
  • [32] Regularized least weighted squares estimator in linear regression
    Kalina, Jan
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024,
  • [33] Partial least squares regression with compositional response variables and covariates
    Chen, Jiajia
    Zhang, Xiaoqin
    Hron, Karel
    JOURNAL OF APPLIED STATISTICS, 2021, 48 (16) : 3130 - 3149
  • [34] On the Equivalence of Linear Discriminant Analysis and Least Squares Regression
    Nie, Feiping
    Chen, Hong
    Xiang, Shiming
    Zhang, Changshui
    Yan, Shuicheng
    Li, Xuelong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 5710 - 5720
  • [35] Iterative Least Trimmed Squares for Mixed Linear Regression
    Shen, Yanyao
    Sanghavi, Sujay
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [36] Fuzzy multiple linear least squares regression analysis
    Li, Yingfang
    He, Xingxing
    Liu, Xueqin
    FUZZY SETS AND SYSTEMS, 2023, 459 : 118 - 143
  • [37] Robustness Analysis for Least Squares Kernel Based Regression: an Optimization Approach
    Falck, Tillmann
    Suykens, Johan A. K.
    De Moor, Bart
    PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, : 6774 - 6779
  • [38] Functional Linear Regression Analysis Based on Partial Least Squares and Its Application
    Wang, Huwien
    Huang, Lele
    MULTIPLE FACETS OF PARTIAL LEAST SQUARES AND RELATED METHODS, 2016, 173 : 201 - 211
  • [39] THE FLEXIBLE LEAST-SQUARES APPROACH TO TIME-VARYING LINEAR-REGRESSION
    KALABA, R
    TESFATSION, L
    JOURNAL OF ECONOMIC DYNAMICS & CONTROL, 1988, 12 (01): : 43 - 48
  • [40] Fuzzy least squares regression model based of weighted distance between fuzzy numbers
    Nasibov E.N.
    Automatic Control and Computer Sciences, 2007, 41 (1) : 10 - 17