Causality in statistics and data science education

Authors
Kevin Cummiskey
Karsten Lübke
Affiliations
[1] United States Military Academy, Department of Mathematical Sciences
[2] FOM University of Applied Sciences, ifes Institute for Empirical Research & Statistics
Keywords
Statistics education research; Data Science; Causality; Bias and Confounding; A22; C18; C55; C80; C90
DOI
10.1007/s11943-022-00311-9
Abstract
Statisticians and data scientists transform raw data into understanding and insight. Ideally, these insights empower people to act and make better decisions. However, data is often misleading, especially when one tries to draw conclusions about causality (Simpson’s paradox is one example). Developing causal thinking in undergraduate statistics and data science programs is therefore important. Yet the education literature offers very little guidance on which topics and learning outcomes specific to causality matter most. In this paper, we propose a causality curriculum for undergraduate statistics and data science programs. Students should be able to think causally, which is defined as a broad pattern of thinking that enables individuals to appropriately assess claims of causality based on statistical evidence. They should understand how the data-generating process affects their conclusions and how to incorporate knowledge from subject matter experts in areas of application. Important topics in causality for the undergraduate curriculum include the potential outcomes framework and counterfactuals, measures of association versus causal effects, confounding, causal diagrams, and methods for estimating causal effects.
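The abstract's distinction between a measure of association and a causal effect can be illustrated with a small sketch of Simpson's paradox. This example is not taken from the paper: the counts, the group names, and the `severity` strata are invented for illustration, and the adjusted estimate can be read causally only under the assumption that severity is the sole confounder.

```python
# Hypothetical counts, invented for illustration: (recovered, total)
# keyed by (treatment group, severity stratum).
counts = {
    ("treated", "mild"): (18, 20),
    ("treated", "severe"): (48, 80),
    ("control", "mild"): (64, 80),
    ("control", "severe"): (10, 20),
}
GROUPS = ("treated", "control")
STRATA = ("mild", "severe")


def crude_rate(group):
    """Recovery rate for a group, ignoring severity (plain association)."""
    recovered = sum(counts[(group, s)][0] for s in STRATA)
    total = sum(counts[(group, s)][1] for s in STRATA)
    return recovered / total


def stratum_rate(group, stratum):
    """Recovery rate for a group within one severity stratum."""
    recovered, total = counts[(group, stratum)]
    return recovered / total


def standardized_rate(group):
    """Stratum rates averaged with weights from the whole population.

    Assumption: severity is the only confounder, so this standardization
    adjusts for everything that needs adjusting.
    """
    grand_total = sum(total for _, total in counts.values())
    rate = 0.0
    for s in STRATA:
        stratum_size = sum(counts[(g, s)][1] for g in GROUPS)
        rate += (stratum_size / grand_total) * stratum_rate(group, s)
    return rate


crude_diff = crude_rate("treated") - crude_rate("control")
adjusted_diff = standardized_rate("treated") - standardized_rate("control")

print(f"crude difference:    {crude_diff:+.2f}")
print(f"adjusted difference: {adjusted_diff:+.2f}")
```

In these made-up data the treated group does better within each severity stratum, yet worse in the aggregate, because treated patients are mostly severe cases. The crude and adjusted differences therefore have opposite signs.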
Pages: 277-286
Page count: 9