Efficient Handling of Heterogeneous File Formats in HDFS

Cited by: 0
Authors
Prashant, More Vaishali [1]
Raut, Suhas D. [1]
Affiliations
[1] NK Orchid Coll Engn & Tech, Dept Comp Sci & Engn, Solapur, Maharashtra, India
Keywords
Big Data; Hadoop; HDFS
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The amount of data in our industry and the world is exploding. Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. In an organization, multiple types of documents are collected from different sources: documents that need to be accessible immediately, documents that need to be accessed within a few seconds or minutes, and documents that are accessed infrequently. While these types of documents play different roles within an organization, each is valuable. These different types of documents require different kinds of storage solutions. To handle such heterogeneous file formats, we use Hadoop. In Hadoop, storage for the different documents is provided by HDFS (Hadoop Distributed File System). In educational organizations, document categorization is also one of the most important tasks. The need to keep documents available and to assign a category to each document motivated the implementation of this project.
Pages: 6