Performance of asymmetric links and correction methods for imbalanced data in binary regression

Cited by: 10
Authors
Huayanay, Alex de la Cruz [1]
Bazan, Jorge L. [2]
Cancho, Vicente G. [2]
Dey, Dipak K. [3]
Affiliations
[1] USP UFSCar, Interinst Grad Stat, Sao Carlos, SP, Brazil
[2] Univ Sao Paulo, Dept Appl Math & Stat, Sao Carlos, SP, Brazil
[3] Univ Connecticut, Dept Stat, Mansfield, CT USA
Funding
São Paulo Research Foundation, Brazil;
Keywords
Asymmetric link; binary regression; imbalanced data; predictive evaluation; quantile residuals; similarity measures; CROSS-VALIDATION; MODEL; PROBIT;
DOI
10.1080/00949655.2019.1593984
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In binary regression, imbalanced data result from the presence of values equal to zero (or one) in a proportion that is significantly greater than the corresponding real values of one (or zero). In this work, we evaluate two methods developed to deal with imbalanced data and compare them to the use of asymmetric links. The results of a simulation study show that the correction methods do not adequately correct the bias in the estimated regression coefficients, and that the power and reverse power link models considered produce better results for certain types of imbalanced data. Additionally, we present an application to imbalanced data, identifying the best model among those proposed. The parameters are estimated using a Bayesian approach based on Hamiltonian Monte Carlo with the No-U-Turn Sampler algorithm, and the models are compared using several model comparison criteria, predictive evaluation, and quantile residuals.
Pages: 1694 - 1714
Number of pages: 21
Related papers
50 in total
  • [21] Online Asymmetric Active Learning with Imbalanced Data
    Zhang, Xiaoxuan
    Yang, Tianbao
    Srinivasan, Padmini
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 2055 - 2064
  • [22] Classification of Imbalanced Data Represented as Binary Features
    Mahmudah, Kunti Robiatul
    Indriani, Fatma
    Takemori-Sakai, Yukiko
    Iwata, Yasunori
    Wada, Takashi
    Satou, Kenji
    APPLIED SCIENCES-BASEL, 2021, 11 (17):
  • [23] An automated approach for binary classification on imbalanced data
    Vieira, Pedro Marques
    Rodrigues, Fatima
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (05) : 2747 - 2767
  • [24] An automated approach for binary classification on imbalanced data
    Pedro Marques Vieira
    Fátima Rodrigues
    Knowledge and Information Systems, 2024, 66 : 2747 - 2767
  • [25] A Hybrid Approach for Binary Classification of Imbalanced Data
    Tsai, Hsinhan
    Yang, Ta-Wei
    Wong, Wai-Man
    Kao, Han-Yi
    Chou, Cheng-Fu
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2024, 23 (03)
  • [26] Binary classification for imbalanced data using data conformity mechanism
    Zheng, Jian
    Ren, Shumiao
    Zhang, Jingyue
    Wang, Shiyan
    Li, Lin
    MULTIMEDIA SYSTEMS, 2025, 31 (01)
  • [27] Framework for Skew-Probit Links in Binary Regression
    Bazan, Jorge L.
    Bolfarine, Heleno
    Branco, Marcia D.
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2010, 39 (04) : 678 - 697
  • [28] ACID: Association Correction for Imbalanced Data in GWAS
    Bao, Feng
    Deng, Yue
    Dai, Qionghai
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018, 15 (01) : 316 - 322
  • [29] Review of imbalanced data classification methods
    Li Y.-X.
    Chai Y.
    Hu Y.-Q.
    Yin H.-P.
    Kongzhi yu Juece/Control and Decision, 2019, 34 (04) : 673 - 688
  • [30] BINARY REGRESSION WITH UNREPLICATED DATA
    ORCHARD, TJ
    BIOMETRICS, 1976, 32 (04) : 935 - 938