Interpretable Machine Learning for Discovery: Statistical Challenges and Opportunities

Times cited: 4
Authors
Allen, Genevera I. [1, 2, 3, 4]
Gan, Luqin [2]
Zheng, Lili [1]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] Rice Univ, Dept Stat, Houston, TX 77005 USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
[4] Baylor Coll Med, Neurol Res Inst, Houston, TX 77030 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
machine learning; interpretability; explainability; data-driven discoveries; validation; stability; selection consistency; uncertainty quantification; VARIABLE SELECTION; CONSISTENCY; VALIDATION; MODELS;
DOI
10.1146/annurev-statistics-040120-030919
Chinese Library Classification number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
New technologies have led to vast troves of large and complex data sets across many scientific domains and industries. People routinely use machine learning techniques not only to process, visualize, and make predictions from these big data, but also to make data-driven discoveries. These discoveries are often made using interpretable machine learning, or machine learning models and techniques that yield human-understandable insights. In this article, we discuss and review the field of interpretable machine learning, focusing especially on the techniques, as they are often employed to generate new knowledge or make discoveries from large data sets. We outline the types of discoveries that can be made using interpretable machine learning in both supervised and unsupervised settings. Additionally, we focus on the grand challenge of how to validate these discoveries in a data-driven manner, which promotes trust in machine learning systems and reproducibility in science. We discuss validation both from a practical perspective, reviewing approaches based on data-splitting and stability, as well as from a theoretical perspective, reviewing statistical results on model selection consistency and uncertainty quantification via statistical inference. Finally, we conclude by highlighting open challenges in using interpretable machine learning techniques to make discoveries, including gaps between theory and practice for validating data-driven discoveries.
Pages: 97-121
Number of pages: 25