Big Data Analysis Using Modern Statistical and Machine Learning Methods in Medicine

Cited by: 55
Authors
Yoo, Changwon [1]
Ramirez, Luis [1]
Liuzzi, Juan [2]
Affiliations
[1] Florida Int Univ, Dept Biostat, Miami, FL 33199 USA
[2] Florida Int Univ, Dept Nutr & Dietet, Miami, FL 33199 USA
Keywords
Bayesian analysis; Statistical data interpretation; Systems biology; Dimensionality reduction method; Cellular control processes; Gene-gene interactions; Bayesian networks; System; Cancer; Identification; Mathematics; Discovery; Pathways
DOI
10.5213/inj.2014.18.2.50
Chinese Library Classification number
R5 [Internal medicine]; R69 [Urology (diseases of the urogenital system)]
Discipline classification code
1002; 100201
Abstract
In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science grow larger and more complicated. In medicine, we therefore need to be aware of this trend and understand the statistical tools available for analyzing these datasets. Many statistical methods aimed at analyzing such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine the knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce the concepts of well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analysis, and of modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude with a promising modern statistical method, Bayesian networks, that is suitable for analyzing big datasets consisting of different types of large clinical, genomic, and environmental data. Such statistical models built from big data will provide a more comprehensive understanding of human physiology and disease.
Pages: 50-57
Number of pages: 8