Big Data: Mining of Log File through Hadoop

Cited by: 0
Authors
Kotiyal, Bina [1]
Kumar, Ankit [2]
Pant, Bhaskar [2]
Goudar, R. H. [1]
Affiliations
[1] GEU Univ, Dept Comp Sci, Bell Rd, Dehra Dun, Uttarakhand, India
[2] GEU Univ, Dept Informat Technol, Bell Rd, Dehra Dun, Uttarakhand, India
Keywords
Big Data; Web Log File; Hadoop; Hadoop Distributed File System; Distributed Processing; MapReduce
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The unremitting increase in computational power has produced a tremendous flow of data over the past two decades, commonly known as "big data". Big data is data that cannot be processed with existing tools or techniques and that, if processed, can yield useful information, for example for analysing user behaviour or for business intelligence. This paper discusses the difference between traditional relational databases and big data, and describes the characteristics of big data. It also covers the distinct channels and processes of big data, the associated challenges, and how big data offers a solution for organizations. Big data is concerned not only with storing and handling large volumes of data but also with analysing that data and extracting the correct information from it in less time. Finally, the paper discusses Hadoop, an open-source framework for the distributed processing of massive datasets on clusters of computers, and demonstrates it by using a log file to extract information based on a user query.
Pages: 7