A Hybrid Regression Model for Mixed Numerical and Categorical Data

Authors
Alghanmi, Nouf [1]
Zeng, Xiao-Jun [1]
Affiliations
[1] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England
Keywords
Decision tree; Regression; Mixed data; Hybrid model; SELECTION
DOI
10.1007/978-3-030-29933-0_31
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
It is noticeable in different heterogeneity types that complexity is inherent in heterogeneous data, and regression analysis methods are well defined and exhibit high-accuracy performance with numeric data. However, real-world problems contain non-numerical variables. There are two main approaches to handling mixed-type data sets in regression analyses. The first approach is unifying data types for all the variables (such as continuous numerical data) and then applying the regression analysis. However, this approach degrades the data quality, as some original data types are converted to other types in the learning stage. The second approach is to apply some similarity measurements, which can be highly complex in some situations. To overcome these limitations, we propose a tree-based regression model to effectively handle the mixed-type data sets without using a dummy code or a similarity measurement.
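The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea it points to: split the data directly on a categorical variable (a one-level tree) and fit a numeric regression inside each group, with no dummy coding and no similarity measure. The function names `fit_line`, `fit_by_category`, and `predict`, the single numeric predictor, and the toy data are all assumptions made for illustration; this is not the authors' model.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one numeric predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # All x values are identical: fall back to a constant prediction.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_by_category(rows):
    """Group (category, x, y) rows by category and fit one line per group.

    The categorical value is used as-is for the split (hypothetical
    one-level tree); it is never converted to dummy variables.
    """
    groups = defaultdict(list)
    for cat, x, y in rows:
        groups[cat].append((x, y))
    return {
        cat: fit_line([x for x, _ in pts], [y for _, y in pts])
        for cat, pts in groups.items()
    }


def predict(models, cat, x):
    """Route the sample to its category's leaf, then apply that leaf's line."""
    a, b = models[cat]
    return a + b * x


# Toy data (invented for illustration): each colour follows a different trend.
rows = [
    ("red", 1.0, 3.0), ("red", 2.0, 5.0), ("red", 3.0, 7.0),
    ("blue", 1.0, 10.0), ("blue", 2.0, 9.0), ("blue", 3.0, 8.0),
]
models = fit_by_category(rows)
```

Calling `predict(models, "red", 4.0)` extrapolates along the line fitted to the "red" rows only. A fuller tree would also have to choose which categorical variable to split on and when to stop, which this sketch leaves out.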
Pages: 369-376
Number of pages: 8
Related papers
50 in total
  • [1] A semi-supervised regression model for mixed numerical and categorical variables
    Ng, Michael K.
    Chan, Elaine Y.
    So, Meko M. C.
    Ching, Wai-Ki
    PATTERN RECOGNITION, 2007, 40 (06) : 1745 - 1752
  • [2] A hybrid decision tree algorithm for mixed numeric and categorical data in regression analysis
    Kim, Kyoungok
    Hong, Jung-Sik
    PATTERN RECOGNITION LETTERS, 2017, 98 : 39 - 45
  • [3] Structured additive regression for categorical space-time data: A mixed model approach
    Kneib, T
    Fahrmeir, L
    BIOMETRICS, 2006, 62 (01) : 109 - 118
  • [4] Clustering mixed numerical and categorical data with missing values
    Dinh, Duy-Tai
    Huynh, Van-Nam
    Sriboonchitta, Songsak
    INFORMATION SCIENCES, 2021, 571 : 418 - 442
  • [5] Unsupervised pattern recognition of mixed data structures with numerical and categorical features using a mixture regression modelling framework
    Ng, Shu-Kay
    Tawiah, Richard
    McLachlan, Geoffrey J.
    PATTERN RECOGNITION, 2019, 88 : 261 - 271
  • [6] A study on a fuzzy clustering for mixed numerical and categorical incomplete data
    Furukawa, Takashi
    Ohnishi, Shin-ichi
    Yamanoi, Takahiro
    2013 INTERNATIONAL CONFERENCE ON FUZZY THEORY AND ITS APPLICATIONS (IFUZZY 2013), 2013, : 425 - 428
  • [7] AN HYBRID APPROACH TO FEATURE SELECTION FOR MIXED CATEGORICAL AND CONTINUOUS DATA
    Doquire, Gauthier
    Verleysen, Michel
    KDIR 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL, 2011, : 394 - 401
  • [8] A MIXED-EFFECTS MODEL FOR CATEGORICAL-DATA
    BEITLER, PJ
    LANDIS, JR
    BIOMETRICS, 1985, 41 (04) : 991 - 1000
  • [9] An Advanced Hybrid Logistic Regression Model for Static and Dynamic Mixed Data Classification
    Quan, Mingxue
    IEEE ACCESS, 2022, 10 : 73623 - 73634
  • [10] Regularized regression for categorical data
    Tutz, Gerhard
    Gertheiss, Jan
    STATISTICAL MODELLING, 2016, 16 (03) : 161 - 200