Big Data: Mining of Log File through Hadoop

Cited by: 0
Authors
Kotiyal, Bina [1]
Kumar, Ankit [2]
Pant, Bhaskar [2]
Goudar, R. H. [1]
Affiliations
[1] GEU Univ, Dept Comp Sci, Bell Rd, Dehra Dun, Uttarakhand, India
[2] GEU Univ, Dept Informat Technol, Bell Rd, Dehra Dun, Uttarakhand, India
Keywords
Big Data; Web Log File; Hadoop; Hadoop Distributed File System; Distributed Processing; MapReduce
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The steady increase in computational power has produced a tremendous flow of data over the past two decades. This flow of data is known as "big data". Big data is data that cannot be processed with existing tools or techniques and that, if processed, can yield interesting information, such as analyses of user behaviour and business intelligence. This paper discusses the difference between traditional relational databases and big data, and it presents the characteristics of big data. The paper also covers the distinct big data channels and processes, the various challenges involved, and how big data offers a solution to organizations. Big data is concerned not only with storing and handling large volumes of data but also with analysing the data and extracting the correct information from it in less time. Finally, the paper discusses Hadoop, an open-source framework that allows distributed processing of massive datasets on clusters of computers, and demonstrates it by using a log file to extract information based on a user query.
Pages: 7