Correlation Analysis of Big Data to Support Machine Learning

Cited by: 8
Authors
Pandey, Rajiv [1]
Dhoundiyal, Manoj [2]
Kumar, Amrendra [2]
Affiliations
[1] Amity Univ, Amity Inst Informat Technol, Lucknow, Uttar Pradesh, India
[2] Amity Univ, IT Dept, Lucknow, Uttar Pradesh, India
Keywords
Quantitative Variables; R; Correlation analysis; Big Data; Linear Model; Linear Regression;
DOI
10.1109/CSNT.2015.32
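Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809
Abstract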
The size and complexity of Big Data datasets call for specialized statistical tools, and we use R for the correlation analysis of our data set. This paper explores correlation analysis through best-fit linear regression of quantitative variables, demonstrated with scatter plots and the best-fit regression line. The analysis demonstrated here is scalable to Big Data in any other context where the quantitative variables are clearly delineated. R provides multiple techniques and inferences for the statistical analysis of a dataset; this paper, however, explores the correlation between quantitative variables, establishing the extent of dependence between them using R functions. The correlation and best-fit line functions of R, i.e. cor() and abline(lmout) respectively, are explored in detail.
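As a rough illustration of the two quantities the abstract refers to, the sketch below computes a Pearson correlation coefficient and a least-squares best-fit line from first principles. It is not the authors' code: the paper works in R with cor() and abline(lmout), whereas this stand-in is written in Python, and the function names pearson_r and best_fit_line as well as the sample data are hypothetical, chosen only for illustration.

```python
from math import sqrt


def _centered_sums(x, y):
    """Return (mean_x, mean_y, sxy, sxx, syy) for two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient between two quantitative variables."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)


def best_fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sample data, not taken from the paper.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(hours, score)
intercept, slope = best_fit_line(hours, score)
print(f"r = {r:.4f}, fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

A value of r close to 1 or -1 indicates a strong linear relationship, and the fitted line is what would be drawn over the scatter plot of the two variables.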
Pages: 996-999
Page count: 4
Related papers
50 in total
  • [1] Open Data Lake to Support Machine Learning on Arctic Big Data
    Olawoyin, Anifat M.
    Leung, Carson K.
    Cuzzocrea, Alfredo
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5215 - 5224
  • [2] A REVIEW ON THE SIGNIFICANCE OF MACHINE LEARNING FOR DATA ANALYSIS IN BIG DATA
    Kolisetty, Vishnu Vandana
    Rajput, Dharmendra Singh
    JORDANIAN JOURNAL OF COMPUTERS AND INFORMATION TECHNOLOGY, 2020, 6 (01): : 41 - 57
  • [3] Big Universe, Big Data: Machine Learning and Image Analysis for Astronomy
    Kremer, Jan
    Stensbo-Smidt, Kristoffer
    Gieseke, Fabian
    Pedersen, Kim Steenstrup
    Igel, Christian
    IEEE INTELLIGENT SYSTEMS, 2017, 32 (02) : 16 - 22
  • [4] Machine Learning and Integrative Analysis of Biomedical Big Data
    Mirza, Bilal
    Wang, Wei
    Wang, Jie
    Choi, Howard
    Chung, Neo Christopher
    Ping, Peipei
    GENES, 2019, 10 (02)
  • [5] Role of Big Data and Machine Learning in Diagnostic Decision Support in Radiology
    Syeda-Mahmood, Tanveer
    JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2018, 15 (03) : 569 - 576
  • [6] Cloud Big Data Decision Support System for Machine Learning on AWS
    Kaplunovich, Alex
    Yesha, Yelena
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 3508 - 3516
  • [7] Correlation Analysis of Network Big Data and Film Time-Series Data Based on Machine Learning Algorithm
    Li, Na
    Xia, Langbo
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [8] A Hybrid Support Vector Machine Algorithm for Big Data Heterogeneity Using Machine Learning
    Ul Ahsaan, Shafqat
    Kaur, Harleen
    Mourya, Ashish Kumar
    Naaz, Sameena
    SYMMETRY-BASEL, 2022, 14 (11):
  • [10] Machine Learning in Big Data
    Wang, Lidong
    Alexander, Cheryl Ann
    INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2016, 1 (02) : 52 - 61