Interpretable Machine Learning for Discovery: Statistical Challenges and Opportunities

Cited by: 7
Authors
Allen, Genevera I. [1, 2, 3, 4]
Gan, Luqin [2]
Zheng, Lili [1]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] Rice Univ, Dept Stat, Houston, TX 77005 USA
[3] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
[4] Baylor Coll Med, Neurol Res Inst, Houston, TX 77030 USA
Funding
National Institutes of Health (NIH); National Science Foundation (NSF)
Keywords
machine learning; interpretability; explainability; data-driven discoveries; validation; stability; selection consistency; uncertainty quantification; VARIABLE SELECTION; CONSISTENCY; VALIDATION; MODELS;
DOI
10.1146/annurev-statistics-040120-030919
Chinese Library Classification number
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
New technologies have led to vast troves of large and complex data sets across many scientific domains and industries. People routinely use machine learning techniques not only to process, visualize, and make predictions from these big data, but also to make data-driven discoveries. These discoveries are often made using interpretable machine learning, or machine learning models and techniques that yield human-understandable insights. In this article, we discuss and review the field of interpretable machine learning, focusing especially on the techniques, as they are often employed to generate new knowledge or make discoveries from large data sets. We outline the types of discoveries that can be made using interpretable machine learning in both supervised and unsupervised settings. Additionally, we focus on the grand challenge of how to validate these discoveries in a data-driven manner, which promotes trust in machine learning systems and reproducibility in science. We discuss validation both from a practical perspective, reviewing approaches based on data-splitting and stability, as well as from a theoretical perspective, reviewing statistical results on model selection consistency and uncertainty quantification via statistical inference. Finally, we conclude by highlighting open challenges in using interpretable machine learning techniques to make discoveries, including gaps between theory and practice for validating data-driven discoveries.
Pages: 97-121
Number of pages: 25