Machine learning for integrating data in biology and medicine: Principles, practice, and opportunities

Cited by: 325
Authors
Zitnik, Marinka [1]
Nguyen, Francis [2, 3]
Wang, Bo [4]
Leskovec, Jure [1, 5]
Goldenberg, Anna [6, 7, 8]
Hoffman, Michael M. [2, 3, 7, 8]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[3] Princess Margaret Canc Ctr, Toronto, ON, Canada
[4] Hikvis Res Inst, Santa Clara, CA USA
[5] Chan Zuckerberg Biohub, San Francisco, CA 94158 USA
[6] SickKids Res Inst, Genet & Genome Biol, Toronto, ON, Canada
[7] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
[8] Vector Inst, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; National Science Foundation (USA)
Keywords
Computational biology; Personalized medicine; Systems biology; Heterogeneous data; Machine learning; DRUG-DRUG INTERACTION; GENOME-WIDE ASSOCIATION; DNA METHYLATION; DATA FUSION; TRANSCRIPTION FACTORS; CHROMATIN-STATE; CHIP-SEQ; PROBABILISTIC FUNCTIONS; MULTICELLULAR FUNCTION; HETEROGENEOUS NETWORK;
DOI
10.1016/j.inffus.2018.09.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include a myriad of properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we describe the principles of data integration and discuss current methods and available implementations. We provide examples of successful data integration in biology and medicine. Finally, we discuss current challenges in biomedical integrative methods and our perspective on the future development of the field.
Pages: 71 - 91
Page count: 21
Related papers
50 in total
  • [1] Opportunities and obstacles for deep learning in biology and medicine
    Ching, Travers
    Himmelstein, Daniel S.
    Beaulieu-Jones, Brett K.
    Kalinin, Alexandr A.
    Do, Brian T.
    Way, Gregory P.
    Ferrero, Enrico
    Agapow, Paul-Michael
    Zietz, Michael
    Hoffman, Michael M.
    Xie, Wei
    Rosen, Gail L.
    Lengerich, Benjamin J.
    Israeli, Johnny
    Lanchantin, Jack
    Woloszynek, Stephen
    Carpenter, Anne E.
    Shrikumar, Avanti
    Xu, Jinbo
    Cofer, Evan M.
    Lavender, Christopher A.
    Turaga, Srinivas C.
    Alexandari, Amr M.
    Lu, Zhiyong
    Harris, David J.
    DeCaprio, Dave
    Qi, Yanjun
    Kundaje, Anshul
    Peng, Yifan
    Wiley, Laura K.
    Segler, Marwin H. S.
    Boca, Simina M.
    Swamidass, S. Joshua
    Huang, Austin
    Gitter, Anthony
    Greene, Casey S.
    JOURNAL OF THE ROYAL SOCIETY INTERFACE, 2018, 15 (141)
  • [2] Integrating recovery principles into practice: Exploring the opportunities and challenges [Workshop]
    Hungerford, Catherine
    Hodgson, Donna
    INTERNATIONAL JOURNAL OF MENTAL HEALTH NURSING, 2010, 19 : A20 - A21
  • [3] Data Learning: Integrating Data Assimilation and Machine Learning
    Buizza, Caterina
    Casas, Cesar Quilodran
    Nadler, Philip
    Mack, Julian
    Marrone, Stefano
    Titus, Zainab
    Le Cornec, Clemence
    Heylen, Evelyn
    Dur, Tolga
    Ruiz, Luis Baca
    Heaney, Claire
    Lopez, Julio Amador Diaz
    Kumar, K. S. Sesh
    Arcucci, Rossella
    JOURNAL OF COMPUTATIONAL SCIENCE, 2022, 58
  • [4] Principles and Practice of Explainable Machine Learning
    Belle, Vaishak
    Papantonis, Ioannis
    FRONTIERS IN BIG DATA, 2021, 4
  • [5] Realistically Integrating Machine Learning Into Clinical Practice: A Road Map of Opportunities, Challenges, and a Potential Future
    Hofer, Ira S.
    Burns, Michael
    Kendale, Samir
    Wanderer, Jonathan P.
    ANESTHESIA AND ANALGESIA, 2020, 130 (05) : 1115 - 1118
  • [6] Theory and Practice of Integrating Machine Learning and Conventional Statistics in Medical Data Analysis
    Dhillon, Sarinder Kaur
    Ganggayah, Mogana Darshini
    Sinnadurai, Siamala
    Lio, Pietro
    Taib, Nur Aishah
    DIAGNOSTICS, 2022, 12 (10)
  • [7] Machine Learning and Network Methods for Biology and Medicine
    Chen, Lei
    Huang, Tao
    Lu, Chuan
    Lu, Lin
    Li, Dandan
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2015, 2015
  • [8] Machine Learning Framework for Classification in Medicine and Biology
    Lee, Eva K.
    INTEGRATION OF AI AND OR TECHNIQUES IN CONSTRAINT PROGRAMMING FOR COMBINATORIAL OPTIMIZATION PROBLEMS, PROCEEDINGS, 2009, 5547 : 1 - 7
  • [9] Machine learning in postgenomic biology and personalized medicine
    Ray, Animesh
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2022, 12 (02)
  • [10] Ocean of Data: Integrating First-Principles Calculations and CALPHAD Modeling with Machine Learning
    Liu, Zi-Kui
    JOURNAL OF PHASE EQUILIBRIA AND DIFFUSION, 2018, 39 (05) : 635 - 649