A Model Driven Approach to Imbalanced Data Sampling in Medical Decision Making

Cited by: 16
Authors
Yin, Hong-Li [1]
Leong, Tze-Yun [1]
Affiliation
[1] Natl Univ Singapore, Med Comp Lab, Sch Comp, Comp 1,13 Comp Dr, Singapore 117417, Singapore
Source
MEDINFO 2010, PTS I AND II | 2010, Vol. 160
Keywords
Random sampling; Synthetic Minority Over Sampling (SMOTE); Model driven sampling; Imbalanced data learning
DOI
10.3233/978-1-60750-588-4-856
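Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Discipline classification code
Abstract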
Classification is an important medical decision support function that can be seriously affected by disproportionate class distribution in the training data. In medical decision making, the rate of misclassification and the cost of misclassifying a minority (positive) class as a majority (negative) class are especially high. In this paper, we propose a new model-driven sampling approach to balancing data samples. Most existing data sampling methods produce new data points based on local, deterministic information. Our approach extends the idea of generative sampling to produce new data points based on an induced probabilistic graphical model. We present the motivation and the design of the proposed algorithm, and compare it with two representative imbalanced data sampling approaches on four medical data sets varying in size, imbalance ratio, and dimension. The empirical study helped identify the challenges in imbalanced data problems in medicine, and highlighted the strengths and limitations of the relevant sampling approaches. Performance of the model driven approach is shown to be comparable with existing approaches; potential improvements could be achieved by incorporating domain knowledge.
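The abstract does not say which probabilistic graphical model is induced or how it is learned, so the following is only a minimal sketch of the general idea of generative, model-based oversampling, not the authors' algorithm. It assumes discrete features and the simplest possible model (one independent categorical distribution per feature, fitted to the minority class only). The function names `fit_minority_model`, `sample_from_model`, and `balance` are hypothetical.

```python
import random
from collections import Counter


def fit_minority_model(rows):
    """Estimate one categorical distribution per feature from minority rows.

    Assumption: features are treated as independent, which is a stand-in
    for whatever graphical model the paper actually induces.
    """
    model = []
    for j in range(len(rows[0])):
        counts = Counter(row[j] for row in rows)
        values = sorted(counts)
        weights = [counts[v] for v in values]
        model.append((values, weights))
    return model


def sample_from_model(model, n, rng):
    """Draw n synthetic rows, sampling each feature from its distribution."""
    return [
        tuple(rng.choices(values, weights=weights, k=1)[0]
              for values, weights in model)
        for _ in range(n)
    ]


def balance(X, y, minority_label, rng=None):
    """Add synthetic minority rows until both classes have equal counts."""
    rng = rng or random.Random(0)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(X) - len(minority)) - len(minority)
    if not minority or n_needed <= 0:
        return list(X), list(y)
    model = fit_minority_model(minority)
    synthetic = sample_from_model(model, n_needed, rng)
    return list(X) + synthetic, list(y) + [minority_label] * n_needed
```

Unlike SMOTE-style interpolation between neighbouring points, new rows here are drawn from a fitted distribution, so they can combine feature values that never co-occurred in any single minority record. A richer model with dependencies between features would replace `fit_minority_model` and `sample_from_model` without changing `balance`.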
Pages: 856-860
Page count: 5