Learning from data with structured missingness

Authors
Robin Mitra
Sarah F. McGough
Tapabrata Chakraborti
Chris Holmes
Ryan Copping
Niels Hagenbuch
Stefanie Biedermann
Jack Noonan
Brieuc Lehmann
Aditi Shenvi
Xuan Vinh Doan
David Leslie
Ginestra Bianconi
Ruben Sanchez-Garcia
Alisha Davies
Maxine Mackintosh
Eleni-Rosalina Andrinopoulou
Anahid Basiri
Chris Harbron
Ben D. MacArthur
Affiliations
[1] The Alan Turing Institute, Statistical Science
[2] University College London, Department of Medical Physics & Biomedical Engineering and UCL Cancer Institute
[3] Genentech, Department of Statistics
[4] University College London, School of Mathematics and Statistics
[5] University of Oxford, School of Mathematics
[6] F. Hoffmann-La Roche AG, Department of Statistics
[7] The Open University, Warwick Business School
[8] Cardiff University, The Digital Environment Research Institute
[9] University of Warwick, School of Mathematical Sciences
[10] University of Warwick, Mathematical Sciences
[11] Queen Mary University of London, Faculty of Health and Life Sciences
[12] Queen Mary University of London, Department of Biostatistics and Department of Epidemiology
[13] University of Southampton, School of Geographical & Earth Sciences
[14] Swansea University, Faculty of Medicine
[15] Public Health Wales
[16] Genomics England
[17] Erasmus MC
[18] University of Glasgow
[19] Roche Pharmaceuticals
[20] University of Southampton
Abstract
Missing data are an unavoidable complication in many machine learning tasks. When data are ‘missing at random’ there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious, and seek to learn from ever-larger volumes of heterogeneous data, an increasingly encountered problem arises in which missing values exhibit an association or structure, either explicitly or implicitly. Such ‘structured missingness’ raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
Pages: 13-23
Page count: 10