Importance of Statistics for Data Mining and Data Science

Cited by: 2
Authors
Ribeiro, Vitor [1 ]
Rocha, Andre [1 ]
Peixoto, Rui [1 ]
Portela, Filipe [1 ]
Santos, Manuel Filipe [1 ]
Affiliation
[1] Univ Minho, Algoritmi Res Ctr, Braga, Portugal
Keywords
Data Mining; Statistics; Statistical Analysis; Data Science
DOI
10.1109/FiCloudW.2017.86
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Managers increasingly recognize knowledge as an important asset for organizations. This recognition stems from the fact that knowledge is increasingly used as a strategic resource to create competitive advantage, improve organizational processes, reduce costs, and more. Data Mining (DM) is an area of study that facilitates that process by extracting useful information and predictions from the vast data sets a company produces. With the help of statistics and its mathematical methods, DM has gradually become more important and useful. Some of the main statistical measures used to perform data analysis are the mean, median, variance, standard deviation, variance analysis, correlation, and regression. This study aims to highlight and demonstrate the importance of statistics in DM, which has considerable potential for creating a competitive advantage for companies. A case study using Intensive Care Medicine data was chosen to demonstrate the importance of statistics for Data Mining.
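As a rough illustration of the measures the abstract lists, the sketch below computes the mean, median, variance, standard deviation, Pearson correlation, and a simple least-squares regression line using only the Python standard library. The function names (`summarize`, `pearson`, `least_squares`) and the sample values are hypothetical and chosen for illustration; they are not taken from the paper or its Intensive Care Medicine case study.

```python
import statistics
from math import sqrt


def summarize(values):
    """Descriptive statistics for one variable (sample variance and std dev)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # divides by n - 1
        "std_dev": statistics.stdev(values),      # square root of the sample variance
    }


def _centered_sums(xs, ys):
    """Sums of squares and cross-products around the means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = slope * x + intercept."""
    mx, my, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up sample data, purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

stats_x = summarize(x)
r = pearson(x, y)
slope, intercept = least_squares(x, y)
print(stats_x, r, slope, intercept)
```

In a data mining workflow, summaries like these are typically computed during data exploration, before any model is trained, to spot skewed variables and strongly related attribute pairs.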
Pages: 156-163
Page count: 8
Related papers
50 in total
  • [1] Data mining at the interface of computer science and statistics
    Smyth, P
    [J]. DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 35 - 61
  • [2] Studies on the Application of Data-mining in the Science of Statistics
    Hao, Jianmin
    Sun, Aijun
    [J]. PROCEEDINGS OF THE 6TH CONFERENCE OF BIOMATHEMATICS, VOLS I AND II: ADVANCES ON BIOMATHEMATICS, 2008 : 245 - 247
  • [3] Statistics, data science, and big data
    Kauermann G.
    Küchenhoff H.
    [J]. AStA Wirtschafts- und Sozialstatistisches Archiv, 2016, 10 (2-3) : 141 - 150
  • [4] The importance of data mining for conservation science: a case study on the wolverine
    Gallant, Daniel
    Gauvin, Lindsay Y.
    Berteaux, Dominique
    Lecomte, Nicolas
    [J]. BIODIVERSITY AND CONSERVATION, 2016, 25 (13) : 2629 - 2639
  • [5] Data science, big data and statistics
    Galeano, Pedro
    Pena, Daniel
    [J]. TEST, 2019, 28 (02) : 289 - 329
  • [6] The importance of data mining for conservation science: a case study on the wolverine
    Daniel Gallant
    Lindsay Y. Gauvin
    Dominique Berteaux
    Nicolas Lecomte
    [J]. Biodiversity and Conservation, 2016, 25 : 2629 - 2639
  • [7] The Lure of Statistics in Data Mining
    Grover, Lovleen Kumar
    Mehra, Rajni
    [J]. JOURNAL OF STATISTICS EDUCATION, 2008, 16 (01)
  • [8] Data Mining and Statistics — Introduction
    Heike Hofmann
    Antony Unwin
    Adalbert Wilhelm
    [J]. Computational Statistics, 2001, 16 : 317 - 321
  • [9] Data mining and statistics - Introduction
    Hofmann, H
    Unwin, A
    Wilhelm, A
    [J]. COMPUTATIONAL STATISTICS, 2001, 16 (03) : 317 - 321
  • [10] Data mining: Statistics and more?
    Hand, DJ
    [J]. AMERICAN STATISTICIAN, 1998, 52 (02) : 112 - 118