Data Science: A Comprehensive Overview

Cited by: 192
Authors
Cao, Longbing [1]
Affiliations
[1] Univ Technol Sydney, Fac Engn & IT, UTS Adv Analyt Inst, POB 123 Broadway, Sydney, NSW 2007, Australia
Funding
Australian Research Council
Keywords
Big data; data analysis; data analytics; advanced analytics; big data analytics; data science; data engineering; data scientist; statistics; computing; informatics; data DNA; data innovation; data economy; data industry; data service; data profession; data education; BIG DATA; STATISTICS; ANALYTICS; FUTURE; GUIDE
DOI
10.1145/3076253
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
The 21st century has ushered in the age of big data and data economy, in which data DNA, which carries important knowledge, insights, and potential, has become an intrinsic constituent of all data-based organisms. An appropriate understanding of data DNA and its organisms relies on the new field of data science and its keystone, analytics. Although it is widely debated whether big data is only hype and buzz, and data science is still in a very early phase, significant challenges and opportunities are emerging or have been inspired by the research, innovation, business, profession, and education of data science. This article provides a comprehensive survey and tutorial of the fundamental aspects of data science: the evolution from data analysis to data science, the data science concepts, a big picture of the era of data science, the major challenges and directions in data innovation, the nature of data analytics, new industrialization and service opportunities in the data economy, the profession and competency of data education, and the future of data science. This article is the first in the field to draw a comprehensive big picture, in addition to offering rich observations, lessons, and thinking about data science and analytics.
Pages: 42