A REVIEW ON SMALL FILES IN HADOOP: A NOVEL APPROACH TO UNDERSTAND THE SMALL FILES PROBLEM IN HADOOP

2017 
Hadoop is an open-source data management system designed for storing and processing large volumes of data, with a default HDFS block size of 64 MB. Files smaller than this block size cannot be stored and processed efficiently by Hadoop, because large numbers of small files result in many disk seeks and constant hopping between datanodes. A survey of the existing literature has been carried out to analyze the effects of, and solutions to, the small files problem in Hadoop. This paper presents that survey, lists several effective solutions to the problem, and argues that further research on the small files problem is needed in order to attain more effective and efficient solutions.
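One mitigation frequently cited in this literature is consolidating many small files into a single container file so that the NameNode tracks one large object instead of thousands of tiny ones. The sketch below is an illustrative example of that idea using Hadoop's SequenceFile API (not a method proposed by this paper); the local input directory and the output path /user/demo/smallfiles.seq are hypothetical, and the key/value scheme (filename to raw bytes) is one common convention among several.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

/**
 * Illustrative sketch: pack many local small files into a single
 * SequenceFile (key = original filename, value = file contents),
 * so HDFS stores one large file instead of thousands of small ones.
 */
public class SmallFilePacker {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path output = new Path("/user/demo/smallfiles.seq"); // hypothetical output path

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(output),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {

            File inputDir = new File(args[0]); // local directory holding the small files
            for (File f : inputDir.listFiles()) {
                if (!f.isFile()) continue;
                byte[] bytes = Files.readAllBytes(f.toPath());
                // Append one record per small file: filename -> raw contents
                writer.append(new Text(f.getName()), new BytesWritable(bytes));
            }
        }
    }
}
```

A MapReduce or Spark job can later read the packed records sequentially from the single SequenceFile, avoiding the per-file metadata and seek overhead described above.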