A novel data-driven robust framework based on machine learning and knowledge graph for disease classification

Cited by: 36
Authors
Lei, Zhenfeng [1]
Sun, Yuan [2]
Nanehkaran, Y. A. [1]
Yang, Shuangyuan [1]
Islam, Md Saiful [2]
Lei, Huiqing [3]
Zhang, Defu [1]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361005, Fujian, Peoples R China
[2] Univ Adelaide, Sch Elect & Elect Engn, Adelaide, SA 5005, Australia
[3] Zhengzhou Univ, Affiliated Hosp 1, Dept Breast Surg, Zhengzhou 450000, Henan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Disease classification; NCDs; Data fusion; Machine learning; Knowledge graph;
DOI
10.1016/j.future.2019.08.030
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Noncommunicable diseases (NCDs) are influenced by diverse factors such as age, region, timeliness, and seasonality, which makes them difficult to treat accurately and affects patients' daily life and work. Although researchers have made progress on certain diseases through clinical and computer-based approaches, there remains considerable room for improvement through computer technologies such as data mining and deep learning. In addition, progress in NCD research has been hampered by the privacy of health and medical data. This paper proposes a hierarchical approach to studying the effects of various factors on diseases and presents an extensible data-driven framework named d-DC. Given that a disease occurs in a certain region, d-DC classifies the disease according to occupation. For data collection, we combined personal or family medical records with traditional methods to build a data acquisition model. The model not only collects and replenishes data automatically but also mitigates the cold-start problem when relatively little data is available. The gathered information includes both structured and unstructured data (such as plain text, images, or videos), which helps improve classification accuracy and the acquisition of new knowledge. In addition to machine learning methods, d-DC employs a knowledge graph (KG) to classify diseases for the first time. Vectorizing medical texts with knowledge embedding is a novel consideration in disease classification. When results are singular, a medical expert system is proposed to resolve inconsistencies through knowledge bases or online experts.
The results of d-DC are displayed using a combination of KG and traditional methods, which intuitively provides a reasonable, highly descriptive interpretation of the results. Experiments show that d-DC achieves higher accuracy than previous methods. In particular, a fusion method called RKRE, based on both ResNet and the expert system, attained an average accuracy of 86.95%, which makes it a promising feasibility study in the field of disease classification. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 534-548
Number of pages: 15