Machines Learn Better with Better Data Ontology: Lessons from Philosophy of Induction and Machine Learning Practice

Cited by: 3
Authors
Li, Dan [1]
Affiliations
[1] CUNY, Baruch Coll, Philosophy Dept, New York, NY 10031 USA
Keywords
Induction; Machine learning; Data ontology; No Free Lunch theorem; Goodman's riddle of induction; CLIMATE; MODELS
DOI
10.1007/s11023-023-09639-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As scientists start to adopt machine learning (ML) as one research tool, the security of ML and the knowledge generated become a concern. In this paper, I explain how supervised ML can be improved with better data ontology, or the way we make categories and turn information into data. More specifically, we should design data ontology in such a way that is consistent with the knowledge that we have about the target phenomenon so that such ontology can help us make the inductive leap. I do so by thinking through a thought experiment, Goodman's New Riddle of Induction (Fact, fiction, and forecast, Harvard University Press, 1955). Goodman's riddle helps flesh out three problems of induction: (1) the problem of equal goodies, that there are often too many equally good inductive results given the same data; (2) the problem of diverging performance, that these equally good results can give opposite predictions in the future; and (3) the problem of mediocrity, that when averaged across all equally possible datasets and tasks, no inductive algorithm outperforms any other. I show that all these three problems are manifested as real obstacles in ML practice, namely, the Rashomon effect (Breiman in Stat Sci 16(3):199-231, 2001), the problem of underspecification (D'Amour et al. in J Mach Learn Res, 2020, https://doi.org/10.48550/arXiv.2011.03395), and the No Free Lunch theorem (Wolpert in Neural Comput 8(7):1341-90, 1996, https://doi.org/10.1162/neco.1996.8.7.1341). Lastly, I argue that proper data ontology can help mitigate these problems and I demonstrate how using concrete examples from climate science. This research highlights the links between philosophers' discussions of induction and implications in ML practice.
Pages: 429-450
Page count: 22
Published in
Minds and Machines, 2023, 33: 429-450