
arxiv: 1803.06354 · v2 · pith:FEHANRVZnew · submitted 2018-03-16 · 💻 cs.DC · cs.DB

Serverless Data Analytics with Flint

keywords flint serverless analytics data design processing spark actual

Serverless architectures organized around loosely-coupled function invocations represent an emerging design for many applications. Recent work mostly focuses on user-facing products and event-driven processing pipelines. In this paper, we explore a completely different part of the application space and examine the feasibility of analytical processing on big data using a serverless architecture. We present Flint, a prototype Spark execution engine that takes advantage of AWS Lambda to provide a pure pay-as-you-go cost model. With Flint, a developer uses PySpark exactly as before, but without needing an actual Spark cluster. We describe the design, implementation, and performance of Flint, along with the challenges associated with serverless analytics.
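To make the idea of analytics built from loosely-coupled function invocations concrete, here is a minimal sketch in plain Python. It is not Flint's implementation and does not use PySpark or AWS Lambda: the function names (`count_words`, `split_into_partitions`, `run_word_count`) are hypothetical, and a local thread pool stands in for independent serverless invocations. It only illustrates the general pattern of splitting the input into partitions, processing each partition with a stateless function, and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Stateless task: count the words in one partition of text lines.

    Assumption: in a serverless setting this would be the body of a single
    function invocation; here it is an ordinary Python function.
    """
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def split_into_partitions(lines, num_partitions):
    """Round-robin split of the input lines into num_partitions lists."""
    return [lines[i::num_partitions] for i in range(num_partitions)]


def run_word_count(lines, num_partitions=3):
    """Run one task per partition, then merge the partial counts."""
    partitions = split_into_partitions(lines, num_partitions)
    # Threads are a local stand-in for independent function invocations.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["a b a", "b c", "a"]
    print(run_word_count(sample))
```

Because each task depends only on its own partition, the tasks can run in any order and on any worker, which is the property that makes this style of computation a candidate for a pay-per-invocation platform.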



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. ServerMix: Tradeoffs and Challenges of Serverless Data Analytics

    cs.DC 2019-07 unverdicted novelty 4.0

    Serverless computing for data analytics involves trade-offs in disaggregation, isolation, and scheduling that push most workloads toward hybrid ServerMix architectures.