arxiv: 1907.11465 · v1 · pith:AM4CWJNBnew · submitted 2019-07-26 · 💻 cs.DC

ServerMix: Tradeoffs and Challenges of Serverless Data Analytics

Pith reviewed 2026-05-24 15:37 UTC · model grok-4.3

classification 💻 cs.DC
keywords: serverless computing, data analytics, hybrid systems, trade-offs, cloud computing, Servermix, disaggregation, isolation

The pith

Serverless data analytics workloads often require hybrid Servermix systems to manage performance trade-offs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes three fundamental trade-offs in today's serverless computing model when applied to data analytics. Relaxing disaggregation increases performance but reduces elasticity. Relaxing isolation boosts performance at the cost of security. Relaxing simple scheduling improves performance but loses sub-second activations. These trade-offs imply that analytics applications will likely adopt hybrid systems combining serverless and serverful components, which the authors term Servermix. This matters because it frames how developers must design and manage cloud analytics systems going forward.

Core claim

Today's serverless computing presents important limitations for data analytics workloads due to three fundamental trade-offs. By relaxing disaggregation, isolation, and simple scheduling, it is possible to increase overall computing performance, but at the expense of elasticity, security, or sub-second activations respectively. The consequence is that analytics applications may end up embracing hybrid systems composed of serverless and serverful components, called Servermix.

What carries the argument

The three trade-offs between disaggregation and elasticity, isolation and security, and simple scheduling and sub-second activations, which together drive the need for Servermix hybrid architectures.
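The three pairings above can be written down as a small lookup, which makes it easier to see what a given design gives up. The sketch below is illustrative only: the `Design` class, its field names, the `COSTS` table, and the rule that any relaxation yields a "Servermix" label are assumptions introduced here, not definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical table (an assumption for illustration): each serverless
# property that can be relaxed, mapped to what is given up in exchange.
COSTS = {
    "disaggregation": "elasticity",
    "isolation": "security",
    "simple_scheduling": "sub-second activations",
}


@dataclass(frozen=True)
class Design:
    """Records which of the three serverless properties a design relaxes."""
    relax_disaggregation: bool = False
    relax_isolation: bool = False
    relax_simple_scheduling: bool = False

    def relaxed(self) -> List[str]:
        # Collect the names of the properties this design relaxes.
        flags = {
            "disaggregation": self.relax_disaggregation,
            "isolation": self.relax_isolation,
            "simple_scheduling": self.relax_simple_scheduling,
        }
        return [name for name, on in flags.items() if on]

    def sacrificed(self) -> List[str]:
        # Look up what each relaxation costs.
        return [COSTS[name] for name in self.relaxed()]

    def label(self) -> str:
        # Toy rule (assumption): any relaxation makes the design a hybrid.
        return "Servermix" if self.relaxed() else "serverless"
```

For example, `Design(relax_disaggregation=True).sacrificed()` reports that elasticity is the property traded away, while a design that relaxes nothing stays labeled "serverless".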

If this is right

  • Most applications in the reviewed related work can be categorized as Servermix systems.
  • Hybrid serverless and serverful components become necessary for many analytics tasks.
  • Managing the identified trade-offs poses major challenges for projects like CloudButton.
  • Analytics applications will need to combine serverless and serverful elements rather than rely on pure serverless.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Cloud platforms may need new interfaces that let applications dynamically shift between serverless and serverful execution modes.
  • Tooling that automatically partitions analytics jobs across the two models could reduce developer effort.
  • Future serverless runtimes might incorporate limited forms of state or longer-lived execution to narrow the performance gap without full relaxation of the model constraints.
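As a rough picture of what the partitioning tooling mentioned above might do, the following sketch assigns each stage of an analytics job to a serverless or serverful backend. Everything in it is hypothetical: the `Stage` fields, the `place` function, and the 300-second default threshold are assumptions chosen for illustration, not values or interfaces from the paper or from any cloud platform.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Stage:
    name: str
    est_seconds: float        # estimated run time of one task in this stage
    needs_shared_state: bool  # e.g. a shuffle or iterative shared state


def place(stages: List[Stage],
          max_function_seconds: float = 300.0) -> Dict[str, str]:
    """Assign each stage to 'serverless' or 'serverful' (toy heuristic)."""
    placement = {}
    for stage in stages:
        # Assumed rule: stateful or long-running stages go to servers,
        # short stateless stages stay on cloud functions.
        if stage.needs_shared_state or stage.est_seconds > max_function_seconds:
            placement[stage.name] = "serverful"
        else:
            placement[stage.name] = "serverless"
    return placement
```

A real tool would need cost and latency models rather than a single threshold; the point here is only that a job whose stages land on both backends is a Servermix deployment in the paper's sense.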

Load-bearing premise

That relaxing disaggregation, isolation, and simple scheduling increases overall computing performance at the expense of elasticity, security, or sub-second activations respectively for data analytics workloads.

What would settle it

A fully serverless data analytics workload that achieves equivalent or better performance than any hybrid Servermix configuration while preserving elasticity, security, and sub-second activations.

Figures

Figures reproduced from arXiv: 1907.11465 by Ana Juan Ferrer, David Breitgand, Gil Vernik, Marc Sánchez-Artigas, Pedro García-López, Peter Pietzuch, Pierre Sutra, Simon Shillaker, Tristan Tarrant.

Figure 1: Tradeoffs. Weakening disaggregation to exploit function and data locality can be useful to improve performance. However, it also means to decrease the scale-out capacity of cloud functions and complicate function scheduling in order to meet user SLOs. The more you move to the left, the closer you are to serverful computing or running VMs or clusters in the datacenter. With isolation the effect is similar. S…

Abstract

Serverless computing has become very popular today since it largely simplifies cloud programming. Developers do not need to longer worry about provisioning or operating servers, and they pay only for the compute resources used when their code is run. This new cloud paradigm suits well for many applications, and researchers have already begun investigating the feasibility of serverless computing for data analytics. Unfortunately, today's serverless computing presents important limitations that make it really difficult to support all sorts of analytics workloads. This paper first starts by analyzing three fundamental trade-offs of today's serverless computing model and their relationship with data analytics. It studies how by relaxing disaggregation, isolation, and simple scheduling, it is possible to increase the overall computing performance, but at the expense of essential aspects of the model such as elasticity, security, or sub-second activations, respectively. The consequence of these trade-offs is that analytics applications may well end up embracing hybrid systems composed of serverless and serverful components, which we call Servermix in this paper. We will review the existing related work to show that most applications can be actually categorized as Servermix. Finally, this paper will introduce the major challenges of the CloudButton research project to manage these trade-offs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper analyzes three fundamental trade-offs in current serverless computing—relaxing disaggregation (to gain performance at cost of elasticity), isolation (at cost of security), and simple scheduling (at cost of sub-second activations)—and argues that these imply data analytics workloads will adopt hybrid serverless-serverful systems, which the authors term Servermix. It supports the claim via a categorization of related work showing most applications fit this hybrid pattern and outlines open challenges from the CloudButton project.

Significance. If the trade-off relationships hold as described, the paper provides a useful conceptual synthesis that frames the likely trajectory of serverless data analytics toward hybrid architectures. The explicit introduction and literature categorization of Servermix offers a lens for organizing prior work and identifying design challenges, which could help guide research even without new empirical results.

major comments (1)
  1. [Abstract] The central implication that the three trade-offs 'consequently' lead analytics applications to embrace Servermix is presented as a direct consequence of the analysis, yet the manuscript provides no formal model, quantitative bounds, or falsifiable prediction linking the specific relaxations to adoption rates or performance thresholds; this makes the load-bearing step from trade-off description to the hybrid-systems conclusion more of an assertion than a derived result.
minor comments (2)
  1. [Abstract] The phrasing 'Developers do not need to longer worry' is grammatically incorrect and should be revised to 'no longer'.
  2. [Abstract] The sentence 'this paper first starts by analyzing' is redundant; 'this paper analyzes' is sufficient and improves readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the abstract. We will revise the wording to more accurately reflect the nature of the argument.

  1. Referee: [Abstract] The central implication that the three trade-offs 'consequently' lead analytics applications to embrace Servermix is presented as a direct consequence of the analysis, yet the manuscript provides no formal model, quantitative bounds, or falsifiable prediction linking the specific relaxations to adoption rates or performance thresholds; this makes the load-bearing step from trade-off description to the hybrid-systems conclusion more of an assertion than a derived result.

    Authors: We agree that the abstract presents the implication in stronger terms than the supporting analysis warrants. The manuscript is a position paper whose central claim is grounded in the three trade-off analyses plus a categorization of existing literature showing that most data-analytics applications already combine serverless and serverful components. No formal model or quantitative thresholds are provided. To address the concern we will revise the abstract to replace 'consequently' with phrasing such as 'suggest that' or 'indicate that' analytics applications are likely to adopt hybrid Servermix architectures.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected


The paper presents a conceptual argument: serverless trade-offs in disaggregation, isolation, and scheduling lead analytics workloads toward hybrid Servermix systems, supported by a review of external related work for categorization. No equations, fitted parameters, self-definitional constructs, or load-bearing self-citations appear; the derivation relies on explicit trade-off analysis and independent literature review rather than reducing to its own inputs by construction. The claim is framed observationally ('may well end up') without quantitative predictions or uniqueness theorems.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the domain assumption that the three listed trade-offs are fundamental to serverless models and that literature shows most analytics apps are hybrid; no free parameters or new entities with independent evidence are introduced beyond naming the hybrid pattern.

axioms (1)
  • Domain assumption: Serverless computing presents important limitations that make it difficult to support all sorts of analytics workloads due to disaggregation, isolation, and scheduling constraints.
    Invoked in the abstract as the starting point for the trade-off analysis.
invented entities (1)
  • Servermix (no independent evidence)
    Purpose: Name for hybrid systems composed of serverless and serverful components in data analytics.
    Introduced to categorize patterns observed in the reviewed related work.

pith-pipeline@v0.9.0 · 5776 in / 1198 out tokens · 52274 ms · 2026-05-24T15:37:51.549047+00:00 · methodology

