Discriminant Analysis of Interval Data: An Assessment of Parametric and Distance-Based Approaches

Cited by: 20
Authors
Silva, A. Pedro Duarte [1, 2]
Brito, Paula [3, 4]
Affiliations
[1] Univ Catolica Portuguesa, Fac Econ & Gestao, Porto, Portugal
[2] Univ Catolica Portuguesa, CEGE, Porto, Portugal
[3] Univ Porto, Fac Econ, P-4100 Porto, Portugal
[4] Univ Porto, LIAAD INESC TEC, P-4100 Porto, Portugal
Keywords
Discriminant analysis; Interval data; Parametric modelling of interval data; Symbolic Data Analysis; STATISTICS;
DOI
10.1007/s00357-015-9189-8
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Building on probabilistic models for interval-valued variables, parametric classification rules, based on Normal or Skew-Normal distributions, are derived for interval data. The performance of such rules is then compared with distance-based methods previously investigated. The results show that Gaussian parametric approaches outperform Skew-Normal parametric and distance-based ones in most conditions analyzed. In particular, with heteroscedastic data a quadratic Gaussian rule always performs best. Moreover, restricted cases of the variance-covariance matrix lead to parsimonious rules which, for small training samples in heteroscedastic problems, can outperform unrestricted quadratic rules, even in some cases where the model assumed by these rules is not true. These restrictions take into account the particular nature of interval data, where observations are defined by both MidPoints and Ranges, which may or may not be correlated. Under homoscedastic conditions linear Gaussian rules are often the best rules, but distance-based methods may perform better in very specific conditions.
Pages: 516-541
Page count: 26