Integration and Classification of Various Massive Datasets using Big Data Framework

2021 
Integrating different big data platforms with input data from their corresponding sources enables analysis and prediction over massive data in a way suited to the nature of that data. This paper presents a mechanism for combining and configuring data processing frameworks to generate input data and then transform and analyze it according to its nature and requirements. We examine tasks in three categories. The first is the classification of a large local dataset using a machine learning algorithm. The second is the integration and operation of SQL data generated from a relational database management system. The third is the analysis in Apache Spark of incoming data streams sourced from the Twitter app manager. The experiment is expected to yield properly categorized and classified results from the local dataset, an integrated relational database, and the most popular hashtags in real-life Twitter post data. Overall, the experiment depicts and evaluates the analysis and combination of technologies for handling big data streams along with their corresponding platforms.
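As a rough illustration of the third task, the hashtag ranking over an incoming stream can be sketched in plain Python. This is not the paper's Spark implementation: the micro-batches are simulated with in-memory lists, and the names `extract_hashtags`, `update_counts`, and `top_hashtags`, as well as the sample posts, are hypothetical and chosen only for this sketch.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(text: str) -> List[str]:
    """Return the hashtags in one post, lowercased so case variants merge."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(text)]


def update_counts(running: Counter, batch: Iterable[str]) -> Counter:
    """Fold one micro-batch of posts into the running hashtag counts."""
    for post in batch:
        running.update(extract_hashtags(post))
    return running


def top_hashtags(running: Counter, n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent hashtags seen so far."""
    return running.most_common(n)


# Hypothetical sample data standing in for two micro-batches of posts.
batches = [
    ["Learning #Spark today", "#BigData with #spark"],
    ["#spark streaming demo", "more #bigdata"],
]

counts = Counter()
for batch in batches:
    update_counts(counts, batch)

top = top_hashtags(counts, 2)
print(top)
```

In a streaming engine the running counter would be replaced by the engine's own stateful or windowed aggregation; the sketch only shows the extract, count, and rank steps.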