Applying big data warehousing and visualization techniques on PingER data

2016 
The Internet has become an essential part of modern society, serving as a platform for research, economic growth, democratic participation and free expression. Its operation has produced a vast growth and accumulation of information known as Big Data, and it is therefore important to monitor and measure the Quality of Service (QoS) of Internet traffic. The SLAC National Accelerator Laboratory started the PingER project in 1995 to measure the end-to-end Internet performance history of servers and routers worldwide. The project covers about 700 monitored sites in over 160 countries. PingER Monitoring Agents (MAs) ping a list of monitored sites every 30 minutes to obtain Round Trip Time (RTT) values, revealing information about both Internet performance (e.g., RTT, jitter, packet loss and unreachability) and major events (e.g., fiber cuts, earthquakes, and social upheavals). The project has thus collected a vast amount of historical Internet performance data worldwide since 1995. Currently, the data is stored in flat text files, which makes it difficult to analyze collectively; this simplistic format limits the data's analytical potential. In this paper, we propose an approach to process, store, analyze and visualize PingER data. A data warehouse is created that combines Hadoop Big Data techniques: the data are processed using the SciCumulus MR workflow, stored in HDFS, analyzed with Impala queries and visualized using Google APIs. This approach makes PingER data more accessible and enhances its potential contribution to ongoing research and application development.
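To illustrate the performance metrics the abstract mentions, the following minimal sketch computes RTT statistics, jitter, packet loss and unreachability from one batch of ping samples. This is illustrative only, not the PingER project's actual code; the function name, field names, and the jitter definition (mean deviation of RTT from the minimum) are assumptions for the example.

```python
# Sketch: deriving the Internet-performance metrics named above
# (RTT, jitter, packet loss, unreachability) from a ping batch.
# Hypothetical helper; not part of the PingER codebase.

def summarize_pings(rtts_ms, packets_sent):
    """Summarize one 30-minute ping batch.

    rtts_ms: RTT values in milliseconds for packets that returned.
    packets_sent: total number of ping packets sent in the batch.
    """
    received = len(rtts_ms)
    loss_pct = 100.0 * (packets_sent - received) / packets_sent
    if received == 0:
        # No replies at all: the target is reported as unreachable.
        return {"loss_pct": loss_pct, "unreachable": True}
    min_rtt = min(rtts_ms)
    avg_rtt = sum(rtts_ms) / received
    # One common jitter definition: mean deviation of RTT from the minimum RTT.
    jitter = sum(r - min_rtt for r in rtts_ms) / received
    return {
        "min_rtt": min_rtt,
        "avg_rtt": avg_rtt,
        "jitter": jitter,
        "loss_pct": loss_pct,
        "unreachable": False,
    }

# Example: 4 packets sent, 3 replies received.
print(summarize_pings([110.0, 120.0, 130.0], packets_sent=4))
```

In the example batch, one of four packets is lost (25% loss), the average RTT is 120 ms, and the jitter relative to the 110 ms minimum is 10 ms.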