FaaScinating Resilience for Serverless Function Choreographies in Federated Clouds

2022 
Cloud applications often benefit from deployment on the serverless technology Function-as-a-Service (FaaS), which can instantly spawn numerous functions and charges users for the period when serverless functions are running. Maximum benefit is achieved when functions are orchestrated in workflows, or function choreographies (FCs). However, many provider limitations specific to FaaS, such as maximum concurrency or duration, often increase the failure rate, which can severely hamper the execution of entire FCs. Current support for resilience is often limited to function retries or try-catch, which are applicable within the same cloud region only. To overcome these limitations, we introduce rAFCL, a middleware platform that maintains the reliability of complex FCs in federated clouds. To support resilient FC execution under rAFCL, our model creates an alternative strategy for each function based on the required availability specified by the user. Alternative strategies are not restricted to the same cloud region, but may contain alternative functions across five providers, invoked concurrently in a single alternative plan or executed one after another in multiple alternative plans. With this approach, rAFCL offers flexibility in terms of the cost-performance trade-off. We evaluated rAFCL by running three real-life applications across three cloud providers. Experimental results demonstrated that rAFCL outperforms the resilience of AWS Step Functions, increasing the success rate of entire FCs by 53.45%, while invoking only 3.94% more functions with zero wasted function invocations. rAFCL significantly improves the availability of entire FCs to almost 1 and survives even massive failures of alternative functions.
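The idea of an alternative strategy made of alternative plans can be illustrated with a minimal Python sketch. It is not rAFCL's actual model or API: the function names, the greedy plan construction, the `plan_size` parameter, and the assumption that functions fail independently are all hypothetical choices made for illustration. Functions within a plan are treated as invoked concurrently, and plans are treated as tried one after another until the user's required availability is reached.

```python
from math import prod


def plan_availability(avail):
    """Probability that at least one concurrently invoked function succeeds.

    Assumes independent failures (an illustrative assumption).
    """
    return 1.0 - prod(1.0 - a for a in avail)


def strategy_availability(plans):
    """Probability that at least one plan succeeds when plans run in sequence."""
    return 1.0 - prod(1.0 - plan_availability(p) for p in plans)


def build_strategy(primary, alternatives, required, plan_size=2):
    """Greedily add alternative plans until the required availability is met.

    primary:      availability of the original function (0..1)
    alternatives: availabilities of candidate alternative functions
    required:     availability requested by the user
    plan_size:    hypothetical cap on functions invoked concurrently per plan
    Returns (plans, achieved_availability).
    """
    plans = []
    pool = sorted(alternatives, reverse=True)  # most available candidates first
    achieved = primary
    while achieved < required and pool:
        plan, pool = pool[:plan_size], pool[plan_size:]
        plans.append(plan)
        # The function fails only if the primary and every plan fail.
        achieved = 1.0 - (1.0 - primary) * (1.0 - strategy_availability(plans))
    return plans, achieved


if __name__ == "__main__":
    plans, achieved = build_strategy(0.9, [0.9, 0.8, 0.7], required=0.999)
    print(plans, achieved)
```

A larger `plan_size` invokes more functions at once, which tends to cost more but reach the target in fewer sequential steps; a smaller one does the opposite. This is one way to picture the cost-performance trade-off mentioned above.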