ServerMix: Tradeoffs and Challenges of Serverless Data Analytics
Pith reviewed 2026-05-24 15:37 UTC · model grok-4.3
The pith
Serverless data analytics workloads often require hybrid Servermix systems to manage performance trade-offs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Today's serverless computing presents important limitations for data analytics workloads due to three fundamental trade-offs. By relaxing disaggregation, isolation, and simple scheduling, it is possible to increase overall computing performance, but at the expense of elasticity, security, or sub-second activations respectively. The consequence is that analytics applications may end up embracing hybrid systems composed of serverless and serverful components, called Servermix.
What carries the argument
The three trade-offs between disaggregation and elasticity, isolation and security, and simple scheduling and sub-second activations, which together drive the need for Servermix hybrid architectures.
If this is right
- Most of the existing applications surveyed in the related work can be categorized as Servermix systems.
- Hybrid serverless and serverful components become necessary for many analytics tasks.
- Managing the identified trade-offs poses major challenges for projects like CloudButton.
- Analytics applications will need to combine serverless and serverful elements rather than rely on pure serverless.
Where Pith is reading between the lines
- Cloud platforms may need new interfaces that let applications dynamically shift between serverless and serverful execution modes.
- Tooling that automatically partitions analytics jobs across the two models could reduce developer effort.
- Future serverless runtimes might incorporate limited forms of state or longer-lived execution to narrow the performance gap without full relaxation of the model constraints.
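To make the partitioning idea above concrete, here is a minimal Python sketch of how a tool might assign the stages of an analytics job to serverless or serverful components. It is not taken from the paper: the `Stage` fields, the `place` rule, and the 10 GB shuffle threshold are all hypothetical choices made for illustration. The rule only encodes the intuition that stages with heavy data exchange or shared mutable state are where full disaggregation costs the most.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Stage:
    """One step of an analytics job (all fields are illustrative assumptions)."""
    name: str
    shuffle_gb: float         # data exchanged between tasks in this stage
    needs_shared_state: bool  # True if tasks share mutable state


def place(stage: Stage, shuffle_limit_gb: float = 10.0) -> str:
    """Return 'serverful' or 'serverless' using a toy rule of thumb."""
    # Heavy data exchange or shared state is where disaggregation hurts most,
    # so such stages are sent to long-lived (serverful) components.
    if stage.needs_shared_state or stage.shuffle_gb > shuffle_limit_gb:
        return "serverful"
    # Everything else stays on functions to keep elasticity.
    return "serverless"


def partition(stages: List[Stage]) -> Dict[str, str]:
    """Map each stage name to its placement."""
    return {s.name: place(s) for s in stages}


# Hypothetical three-stage job.
job = [
    Stage("map", shuffle_gb=0.0, needs_shared_state=False),
    Stage("shuffle", shuffle_gb=50.0, needs_shared_state=False),
    Stage("train", shuffle_gb=1.0, needs_shared_state=True),
]
plan = partition(job)
print(plan)
```

In this toy job only the embarrassingly parallel `map` stage stays serverless, so the resulting plan is a hybrid one. A real tool would need measured costs rather than a fixed threshold.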
Load-bearing premise
That relaxing disaggregation, isolation, and simple scheduling increases overall computing performance at the expense of elasticity, security, or sub-second activations respectively for data analytics workloads.
What would settle it
A fully serverless data analytics workload that achieves equivalent or better performance than any hybrid Servermix configuration while preserving elasticity, security, and sub-second activations.
Original abstract
Serverless computing has become very popular today since it largely simplifies cloud programming. Developers do not need to longer worry about provisioning or operating servers, and they pay only for the compute resources used when their code is run. This new cloud paradigm suits well for many applications, and researchers have already begun investigating the feasibility of serverless computing for data analytics. Unfortunately, today's serverless computing presents important limitations that make it really difficult to support all sorts of analytics workloads. This paper first starts by analyzing three fundamental trade-offs of today's serverless computing model and their relationship with data analytics. It studies how by relaxing disaggregation, isolation, and simple scheduling, it is possible to increase the overall computing performance, but at the expense of essential aspects of the model such as elasticity, security, or sub-second activations, respectively. The consequence of these trade-offs is that analytics applications may well end up embracing hybrid systems composed of serverless and serverful components, which we call Servermix in this paper. We will review the existing related work to show that most applications can be actually categorized as Servermix. Finally, this paper will introduce the major challenges of the CloudButton research project to manage these trade-offs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes three fundamental trade-offs in current serverless computing—relaxing disaggregation (to gain performance at cost of elasticity), isolation (at cost of security), and simple scheduling (at cost of sub-second activations)—and argues that these imply data analytics workloads will adopt hybrid serverless-serverful systems, which the authors term Servermix. It supports the claim via a categorization of related work showing most applications fit this hybrid pattern and outlines open challenges from the CloudButton project.
Significance. If the trade-off relationships hold as described, the paper provides a useful conceptual synthesis that frames the likely trajectory of serverless data analytics toward hybrid architectures. The explicit introduction and literature categorization of Servermix offers a lens for organizing prior work and identifying design challenges, which could help guide research even without new empirical results.
major comments (1)
- [Abstract] Abstract: The central implication that the three trade-offs 'consequently' lead analytics applications to embrace Servermix is presented as a direct consequence of the analysis, yet the manuscript provides no formal model, quantitative bounds, or falsifiable prediction linking the specific relaxations to adoption rates or performance thresholds; this makes the load-bearing step from trade-off description to the hybrid-systems conclusion more of an assertion than a derived result.
minor comments (2)
- [Abstract] Abstract: The phrasing 'Developers do not need to longer worry' is ungrammatical and should be revised to 'Developers no longer need to worry'.
- [Abstract] Abstract: The sentence 'this paper first starts by analyzing' is redundant; 'this paper analyzes' is sufficient and improves readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on the abstract. We will revise the wording to more accurately reflect the nature of the argument.
Point-by-point responses
- Referee: [Abstract] Abstract: The central implication that the three trade-offs 'consequently' lead analytics applications to embrace Servermix is presented as a direct consequence of the analysis, yet the manuscript provides no formal model, quantitative bounds, or falsifiable prediction linking the specific relaxations to adoption rates or performance thresholds; this makes the load-bearing step from trade-off description to the hybrid-systems conclusion more of an assertion than a derived result.
  Authors: We agree that the abstract presents the implication in stronger terms than the supporting analysis warrants. The manuscript is a position paper whose central claim is grounded in the three trade-off analyses plus a categorization of existing literature showing that most data-analytics applications already combine serverless and serverful components. No formal model or quantitative thresholds are provided. To address the concern we will revise the abstract to replace 'consequently' with phrasing such as 'suggest that' or 'indicate that' analytics applications are likely to adopt hybrid Servermix architectures.
  Revision: yes
Circularity Check
No significant circularity detected
The paper presents a conceptual argument: serverless trade-offs in disaggregation, isolation, and scheduling lead analytics workloads toward hybrid Servermix systems, supported by a review of external related work for categorization. No equations, fitted parameters, self-definitional constructs, or load-bearing self-citations appear; the derivation relies on explicit trade-off analysis and independent literature review rather than reducing to its own inputs by construction. The claim is framed observationally ('may well end up') without quantitative predictions or uniqueness theorems.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Serverless computing presents important limitations that make it difficult to support all sorts of analytics workloads due to disaggregation, isolation, and scheduling constraints.
invented entities (1)
- Servermix: no independent evidence