GES DISC Data Recipes in Jupyter Notebooks

2017 
The Earth Science Data and Information System (ESDIS) Project manages twelve Distributed Active Archive Centers (DAACs), which are geographically dispersed across the United States. The DAACs are responsible for ingesting, processing, archiving, and distributing Earth science data produced from various sources (satellites, aircraft, field measurements, etc.). In response to projections of an exponential increase in data production, there has been a recent effort to prototype various DAAC activities in a cloud computing environment. This, in turn, led to the creation of an initiative, the Cloud Analysis Toolkit to Enable Earth Science (CATEES), to develop a Python software package for transitioning Earth science data processing to the cloud. This project supports CATEES and has two primary goals: first, to transition data recipes created by the Goddard Earth Sciences Data and Information Services Center (GES DISC) into an interactive and educational environment using Jupyter Notebooks; second, to acclimate Earth scientists to cloud computing. To accomplish these goals, we create Jupyter Notebooks that compartmentalize the different steps of data analysis and help users obtain and parse data from the command line. We also develop a Docker container, comprising Jupyter Notebooks, Python dependencies, and command line tools, and configure it into an easy-to-deploy package. The result is an end-to-end product that simulates the use case of end users working in a cloud computing environment.