Partition of Interval-Valued Observations Using Regression

Cited by: 2
Authors
Liu, Fei [1]
Billard, L. [2]
Affiliations
[1] Bank Amer, Charlotte, NC 28202 USA
[2] Univ Georgia, Dept Stat, Athens, GA 30602 USA
Keywords
Clusters; k-means algorithm; k-regressions algorithm; Hausdorff distance; City-block distance; Center distance; Simulation methods; Real-data application; CLUSTERS; NUMBER;
DOI
10.1007/s00357-021-09394-5
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Both regression modeling and clustering methodologies have been extensively studied as separate techniques. There has been some activity in using regression-based algorithms to partition a data set into clusters for classical data; we propose one such algorithm to cluster interval-valued data. The new algorithm is based on the k-means algorithm of MacQueen (1967) and the dynamical partitioning method of Diday and Simon (1976), with the partitioning criteria being based on establishing regression models for each sub-cluster. The partitioning also depends on distance measures between the underlying regression models for each sub-cluster. Several types of simulated data sets are generated for several different data structures. The proposed k-regressions algorithm consistently outperforms the k-means algorithm. Elbow plots are used to identify the total number of clusters K in the partition. The new method is also applied to real data.
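As a rough illustration of the k-regressions idea described in the abstract, the following minimal sketch alternates between fitting one regression line per cluster and reassigning each observation to the best-fitting line. It is not the authors' algorithm: it assumes a single interval-valued predictor and response, reduces each interval to its midpoint, and uses the squared residual as the assignment criterion in place of the Hausdorff, city-block, or center distances named in the keywords. The function name `k_regressions` and all of its parameters are hypothetical.

```python
import numpy as np


def k_regressions(x_lo, x_hi, y_lo, y_hi, k, n_iter=50, seed=0):
    """Illustrative sketch only (assumed simplification, not the paper's method).

    Each interval [lo, hi] is reduced to its midpoint, and observations are
    partitioned by which cluster's regression line fits them best.
    """
    rng = np.random.default_rng(seed)
    # Interval midpoints (centers) for the predictor and the response.
    xc = (np.asarray(x_lo, dtype=float) + np.asarray(x_hi, dtype=float)) / 2.0
    yc = (np.asarray(y_lo, dtype=float) + np.asarray(y_hi, dtype=float)) / 2.0
    n = xc.size
    design = np.column_stack([np.ones(n), xc])  # columns: intercept, slope

    labels = rng.integers(0, k, size=n)  # random initial partition
    coefs = np.zeros((k, 2))
    for _ in range(n_iter):
        # Step 1: fit an ordinary least-squares line within each cluster.
        for j in range(k):
            members = labels == j
            if members.sum() >= 2:  # need two points to fit a line
                coefs[j] = np.linalg.lstsq(
                    design[members], yc[members], rcond=None
                )[0]
        # Step 2: reassign each observation to the line with the
        # smallest squared residual (assumed criterion).
        resid = (yc[:, None] - design @ coefs.T) ** 2
        new_labels = resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, coefs
```

Under these assumptions, an elbow-style choice of K could be approximated by running the function for several values of `k` and comparing the total squared residual of the resulting partitions. Because the initial partition is random, results can depend on `seed`.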
Pages: 55-77
Page count: 23
Related papers
50 in total
  • [1] Partition of Interval-Valued Observations Using Regression
    Fei Liu
    L. Billard
    [J]. Journal of Classification, 2022, 39 : 55 - 77
  • [2] Regression analysis for interval-valued data
    Billard, L
    Diday, E
    [J]. DATA ANALYSIS, CLASSIFICATION, AND RELATED METHODS, 2000, : 369 - 374
  • [3] Interval-valued fuzzy hypergraph and fuzzy partition
    Chen, SM
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1997, 27 (04): : 725 - 733
  • [4] Constrained Regression for Interval-Valued Data
    Gonzalez-Rivera, Gloria
    Lin, Wei
    [J]. JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2013, 31 (04) : 473 - 490
  • [5] Resistant Regression for Interval-Valued Data
    Renan, Jobson
    Silva, Jornandes Dias
    Galdino, Sergio
    [J]. 2013 1ST BRICS COUNTRIES CONGRESS ON COMPUTATIONAL INTELLIGENCE AND 11TH BRAZILIAN CONGRESS ON COMPUTATIONAL INTELLIGENCE (BRICS-CCI & CBIC), 2013, : 277 - 281
  • [6] Linear regression with interval-valued data
    Sun, Yan
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2016, 8 (01): : 54 - 60
  • [7] Quantile Regression of Interval-Valued Data
    Fagundes, Roberta A. A.
    de Souza, Renata M. C. R.
    Soares, Yanne M. G.
    [J]. 2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 2586 - 2591
  • [8] Clustering regression based on interval-valued fuzzy outputs and interval-valued fuzzy parameters
    Arefi, Mohsen
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 30 (03) : 1339 - 1351
  • [10] Interval-valued data regression using nonparametric additive models
    Changwon Lim
    [J]. Journal of the Korean Statistical Society, 2016, 45 : 358 - 370