Data leakage jeopardizes ecological applications of machine learning

Cited by: 9
Authors
Stock, Andy [1]
Gregr, Edward J. [1, 2]
Chan, Kai M. A. [1]
Affiliations
[1] Univ British Columbia, Inst Resources Environm & Sustainabil, Vancouver, BC, Canada
[2] SciTech Environm Consulting, Vancouver, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
VALIDATION
DOI
10.1038/s41559-023-02162-1
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
Machine learning is a popular tool in ecology but many scientific applications suffer from data leakage, causing misleading results. We highlight common pitfalls in ecological machine-learning methods and argue that discipline-specific model info sheets must be developed to aid in model evaluations.
Pages: 1743-1745
Page count: 3
Related papers
50 results in total
  • [1] Data leakage jeopardizes ecological applications of machine learning
    Andy Stock
    Edward J. Gregr
    Kai M. A. Chan
    Nature Ecology & Evolution, 2023, 7 : 1743 - 1745
  • [2] A review of supervised machine learning algorithms and their applications to ecological data
    Crisci, C.
    Ghattas, B.
    Perera, G.
    ECOLOGICAL MODELLING, 2012, 240 : 113 - 122
  • [3] Guiding questions to avoid data leakage in biological machine learning applications
    Bernett, Judith
    Grimm, Dominik G.
    Haselbeck, Florian
    Blumenthal, David B.
    Joeres, Roman
    Kalinina, Olga V.
    List, Markus
    NATURE METHODS, 2024, 21 (08) : 1444 - 1453
  • [4] ON DATA LEAKAGE PREVENTION AND MACHINE LEARNING
    Domnik, Jan
    Holland, Alexander
    35TH BLED ECONFERENCE DIGITAL RESTRUCTURING AND HUMAN (RE)ACTION, BLED ECONFERENCE 2022, 2022, : 695 - 703
  • [5] Applications of symbolic machine learning to ecological modelling
    Dzeroski, S
    ECOLOGICAL MODELLING, 2001, 146 (1-3) : 263 - 273
  • [6] Machine learning of poorly predictable ecological data
    Shan, Y.
    Paull, D.
    McKay, R. I.
    ECOLOGICAL MODELLING, 2006, 195 (1-2) : 129 - 138
  • [7] Information Leakage from Data Updates in Machine Learning Models
    Hui, Tian
    Farokhi, Farhad
    Ohrimenko, Olga
    PROCEEDINGS OF THE 16TH ACM WORKSHOP ON ARTIFICIAL INTELLIGENCE AND SECURITY, AISEC 2023, 2023, : 35 - 41
  • [8] Editorial: Analysis and synthesis of ecological data by machine learning
    Recknagel, Friedrich
    Staiano, Antonino
    ECOLOGICAL INFORMATICS, 2019, 53
  • [9] Machine Learning Applications for the Prediction of Bone Cement Leakage in Percutaneous Vertebroplasty
    Li, Wenle
    Wang, Jiaming
    Liu, Wencai
    Xu, Chan
    Li, Wanying
    Zhang, Kai
    Su, Shibin
    Li, Rong
    Hu, Zhaohui
    Liu, Qiang
    Lu, Ruogu
    Yin, Chengliang
    FRONTIERS IN PUBLIC HEALTH, 2021, 9
  • [10] Special issue: Applications of machine learning to ecological modelling - Preface
    Recknagel, F
    ECOLOGICAL MODELLING, 2001, 146 (1-3) : 1 - 2