MemEFS: A network-aware elastic in-memory runtime distributed file system

2017 
Scientific domains such as astronomy or bioinformatics produce increasingly large amounts of data that need to be analyzed. Such analyses are modeled as scientific workflows: applications composed of many individual tasks that exhibit data dependencies. Typically, these applications suffer from significant variability in the interplay between achieved parallelism and data footprint. To efficiently tackle the data deluge, cost-effective solutions need to be deployed by extending private computing infrastructures with public cloud resources. To achieve this, two key features of such systems need to be addressed: elasticity and network adaptability. The former improves compute resource utilization efficiency, while the latter improves network utilization efficiency, since public clouds suffer from significant bandwidth variability. This paper extends our previous work on MemEFS, an in-memory elastic distributed file system, by adding network adaptability. Our results show that MemEFS’ elasticity increases resource utilization efficiency by up to 65%. With its network adaptation policy, MemEFS achieves a speedup of up to 50% compared to its network-agnostic counterpart.