pith. sign in

arxiv: 2508.15647 · v2 · pith:ID3DTAWMnew · submitted 2025-08-21 · 💻 cs.DC

CausalMesh: A Formally Verified Causally Consistent Distributed Cache with Support for Client Migration

Pith reviewed 2026-05-21 22:07 UTC · model grok-4.3

classification 💻 cs.DC
keywords causal consistencydistributed cachingclient migrationformal verificationserverless computingDafnytransactional consistency
0
0 comments X

The pith

CausalMesh maintains causal consistency in distributed caches even when clients migrate between servers without coordination or aborts for basic operations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

CausalMesh addresses the problem of inconsistent data views when clients fetch data from different cache nodes after migrating, which is common in cloud and serverless applications. It does this by introducing a protocol that tracks causal dependencies in a coordination-free manner. The system supports abort-free reads, writes, and read transactions during migrations, with transactional causal consistency available at the expense of possible aborts. This is important for maintaining application correctness in dynamic environments like stateful workflows. The protocol has been formally verified using Dafny to ensure its correctness.

Core claim

CausalMesh is the first cache system to support coordination-free, abort-free read/write operations and read transactions when clients migrate across multiple servers, while also supporting read-write transactional causal consistency at the cost of abort-freedom, and it has been formally verified in Dafny.

What carries the argument

A migration-aware causal dependency tracking protocol that uses observable client migrations and network message ordering to maintain consistency without additional coordination.

If this is right

  • Applications like stateful serverless workflows can run correctly across migrating functions without seeing inconsistent states.
  • The system achieves lower latency and higher throughput than prior proposals for consistent caching.
  • Read-write transactions are possible under migration, though they may require aborts in some cases.
  • Formal verification provides high assurance of the protocol's adherence to causal consistency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar techniques might apply to other distributed systems where processes move between nodes, such as in edge computing.
  • Performance gains could make causal consistency more practical for high-throughput cloud services.
  • Extending the approach to stronger consistency models might be feasible if migration observability is maintained.

Load-bearing premise

The protocol assumes that the system can observe client migrations and that the network delivers messages in an order that permits tracking causal dependencies without extra coordination.

What would settle it

Demonstrating a scenario where a migrating client observes a violation of causal consistency, such as reading a value that should depend on a prior write not yet visible, would falsify the correctness claim.

Figures

Figures reproduced from arXiv: 2508.15647 by Haoran Zhang, Sebastian Angel, Shuai Mu, Vincent Liu, Zihao Zhang.

Figure 1
Figure 1. Figure 1: Anomalies rate of a two-function workflow, where the second function reads the data written by the first. The workflow runs on AWS Lambda and using DynamoDB Accelerator (DAX) as the cache. There are no anomalies when utilizing a single cache node, but it lacks scalability. provides strong consistency but does not scale, or (ii) a cluster of caches such as Amazon DynamoDB Accelerator (DAX) that scales well … view at source ↗
Figure 2
Figure 2. Figure 2: illustrates the architecture of CausalMesh. The architecture consists of four components: Serverless Platform. The serverless platform acts as the runtime environment for user applications. It orchestrates the workflow and dispatches functions to available workers. Databases. The database stores user data. CausalMesh supports any database, as long as the database allows custom conflict resolution policies … view at source ↗
Figure 3
Figure 3. Figure 3: Pseudo-code for Dependency integration. To reduce the size of metadata, it only contains the nearest depen￾dencies, meaning that if x → y (x happens before y, y depends on x) and y → z, z’s dependency will contain y but not x. Dependencies can be merged using the same mechanism as vector clocks. 4 The Dual Cache A core component of CausalMesh is its dual cache. The dual cache is what makes CausalMesh coord… view at source ↗
Figure 4
Figure 4. Figure 4: CausalMesh’s Server APIs. The first three APIs (Client*) are used in CausalMesh’s client library. ServerWrite is called by other CausalMesh servers via RPC to propagate writes. Note that users do not interact directly with these functions. 5.2 Write Path Clients send their writes to the cache server they are connected to, which is typically co-located in a serverless environment. Each write is initially sa… view at source ↗
Figure 5
Figure 5. Figure 5: Propagation chain in a three-server setup. The circled numbers show the order of propagation in the chain starting from S0. round; in the second round, the write is simply forwarded. When the write reaches the tail in the chain, it is integrated, along with its dependencies, into that server’s C-cache [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: presents the pseudo-code for the read transaction in the client library. If the client reads a key that it has written before, similar to read operations, the client library merges the result from the server with its own write. However, the client’s own writes may 1 # self is a client 2 def read_txn (self , keys ) : 3 values , vcs = ClientReadTxn ( keys , self . deps ) 4 res = [] 5 for k , v , vc in zip ( … view at source ↗
Figure 6
Figure 6. Figure 6: Pseudo-code for CausalMesh’s Server. Each cache server serves read requests independently without consulting other servers or going through the propagation chain, making reads in CausalMesh coordination-free. 5.4 Read Transactions CausalMesh supports causally-consistent read transactions. A trans￾actional read request includes a set of keys. When the cache server receives such requests, it integrates the d… view at source ↗
Figure 8
Figure 8. Figure 8: Example of two client c1 and c2 in a three-server setup. The arrows between servers are data propagation. The figure only shows the second round of propagation for simplicity. The arrows from and to the server are reads and writes, respectively. Figure showsc2 read c1’s y at S2 and integrates both x and y after migrating to S1 [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
Figure 10
Figure 10. Figure 10: Pseudo-code for CausalMesh-TCC’s Server. transactions, specifically when dealing with access-control lists (ACLs), as well as read transactions across multiple serverless func￾tions. To tackle this limitation, we propose an extension of Causal￾Mesh called CausalMesh-TCC, which provides Transactional Causal Consistency (TCC) [3, 43] as a stronger consistency level. In Causal￾Mesh-TCC, each workflow is trea… view at source ↗
Figure 11
Figure 11. Figure 11: Comparison between CausalMesh, CausalMesh-TCC, HydroCache-Con, and HydroCache-Opt. N is the number of servers. Unknown ReadSet means that the read set does not need to be known ahead of time, which is needed for supporting dynamic workflows. HydroCache-Opt’s Unknown ReadSet field is No∗ because it supports partially dynamic workflows (§12). In HydroCache and FaaSTCC, writes become visible after a refresh … view at source ↗
Figure 13
Figure 13. Figure 13: CPU and memory usage of cache servers in the micro￾benchmark. The request rate is 1000 requests per second. Our sys￾tem uses a single thread, while HydroCache and FaaSTCC use two threads. Metadata is the additional data required by each protocol to ensure correctness. the tail if the workflow does not require read/write transactions within a function or cross-function transactions. 9.3 Resource Overhead T… view at source ↗
Figure 12
Figure 12. Figure 12: Median and tail response time and throughput of Causal￾Mesh, CausalMesh-TCC, HydroCache-Con, and HydroCache-Opt in the micro-benchmark. paper [66]. The first two functions in the workflow read three keys, while the last function writes to a single key. The keys are sampled from a pool of 1,000,000 keys, following a Zipfian distribution with a coefficient of 1.0. The value is an 8-byte string. Results [PI… view at source ↗
Figure 14
Figure 14. Figure 14: The histogram with y-axis on the left depicts the throughput as we vary the number of servers. The line plot with y-axis on the right shows the normalized throughput by dividing the throughput by the number of servers. metadata grows over time. FaaSTCC reduces a great amount of metadata by delegating the job of assigning versions to the under￾lying storage system. 9.4 Effect of the Number of Caches To eva… view at source ↗
Figure 16
Figure 16. Figure 16: Relationship between the number of servers and vis￾ibility, as measured by the inconsistency window [7]. Marginal inconsistencies indicate the increase in delay that the system expe￾riences when an additional server is added. We evaluate a mixed workload, consisting of 50% ComposeRe￾view and 50% ReadReview. ComposeReview generates a review for a random user and movie, and then saves the review ID to the p… view at source ↗
Figure 17
Figure 17. Figure 17: A causal consistency violation under single-round prop￾agation. 1 # s is the old state of a cache server 2 # s' is the new state of a cache server 3 # m is the received message 4 # sm is the sent message 5 predicate ClientRead (s , m , s', sm ) 6 requires m is Read 7 { 8 var ( icache ', ccache ') := integrate (s. icache , s. ccache , m . deps ); 9 && s' == s .( 10 icache := icache ', 11 ccache := ccache '… view at source ↗
Figure 18
Figure 18. Figure 18: The TLA-style action predicate for the ClientRead func￾tion in [PITH_FULL_IMAGE:figures/full_fig_p013_18.png] view at source ↗
read the original abstract

Cloud applications often insert a caching lay\-er in front of a database in order to reduce I/O latency and improve throughput. One complication occurs when a client fetches some data from one cache node, then migrates to another (e.g., due to failures, load balancing, or client mobility), where it fetches the remaining data. If the data in the cache nodes is inconsistent, the client could observe states that undermine the application's correctness. One example of a situation where this is common is stateful serverless workflows, which consist of multiple serverless functions that access state in a remote database. In serverless, functions in the same workflow may be scheduled to different nodes with different caches, resulting in the migration pattern described above -- the same client (the workflow) reads some data from one cache and other data from another. To address this issue, this paper presents CausalMesh, a novel approach to causally consistent distributed caching in environments where computations may migrate between machines. CausalMesh is the first cache system to support coordination-free, abort-free read/write operations and read transactions when clients migrate across multiple servers. CausalMesh also supports read-write transactional causal consistency in the presence of client migration, but at the cost of abort-freedom. Our experimental evaluation shows that CausalMesh has lower latency and higher throughput than existing proposals. Finally, we have formally verified the correctness of \sys's protocol in Dafny.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents CausalMesh, a causally consistent distributed cache designed for environments with client migrations (e.g., serverless workflows). It claims to be the first system supporting coordination-free, abort-free read/write operations and read transactions across migrations, while also offering read-write transactional causal consistency (with aborts). The protocol is formally verified for correctness in Dafny, and experiments report lower latency and higher throughput than existing proposals.

Significance. If the central claims hold, the work is significant for practical distributed systems and serverless computing, where client or function migrations are routine. The machine-checked Dafny proofs provide strong evidence for the consistency invariants, and the performance results address real latency concerns in caching layers. This combination of formal guarantees and empirical gains could inform future cache designs.

major comments (2)
  1. [Approach and Verification sections] The migration handling logic (described in the approach and verified in Dafny): the central claim of coordination-free causal consistency across migrations depends on observable client handoffs and network delivery order. The Dafny model appears to idealize these as reliable FIFO channels with instantaneous detection; the paper must explicitly state whether the verified invariants transfer to asynchronous networks with possible reordering or delayed observability, as this is load-bearing for deployment claims.
  2. [Evaluation section] § on experimental evaluation: the performance comparison to baselines should include details on how migration frequency and network conditions were modeled, to substantiate the throughput/latency gains under the same assumptions used in the proof.
minor comments (2)
  1. [Abstract] Abstract contains a formatting artifact ('lay- er'); this and similar typographical issues should be cleaned up.
  2. [Protocol description] Notation for causal metadata propagation during migration could be clarified with a small example or diagram to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major comment below and indicate planned revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Approach and Verification sections] The migration handling logic (described in the approach and verified in Dafny): the central claim of coordination-free causal consistency across migrations depends on observable client handoffs and network delivery order. The Dafny model appears to idealize these as reliable FIFO channels with instantaneous detection; the paper must explicitly state whether the verified invariants transfer to asynchronous networks with possible reordering or delayed observability, as this is load-bearing for deployment claims.

    Authors: We agree that the network assumptions require explicit clarification. The Dafny model formalizes client handoffs and message delivery using reliable FIFO channels with instantaneous observability, which is a standard abstraction used to prove causal consistency invariants in the literature. The verified properties hold under these assumptions. Our implementation assumes an ordered transport (e.g., TCP) for migration handoff messages. We will revise the verification section to state these modeling choices explicitly and discuss their scope, noting that arbitrary reordering would require additional mechanisms such as sequence numbers to preserve the guarantees. This change will better align the formal claims with deployment considerations. revision: yes

  2. Referee: [Evaluation section] § on experimental evaluation: the performance comparison to baselines should include details on how migration frequency and network conditions were modeled, to substantiate the throughput/latency gains under the same assumptions used in the proof.

    Authors: We appreciate the request for additional experimental details. In the evaluation, client migrations were triggered at rates ranging from 1% to 10% of operations, and network conditions were simulated using latency distributions (10–100 ms) and low packet-loss rates representative of serverless environments. These parameters were chosen to exercise the migration path while remaining consistent with the FIFO delivery assumptions in the formal model. We will expand the evaluation section to document these parameters, the simulator configuration, and their relationship to the verified protocol. revision: yes

Circularity Check

0 steps flagged

No circularity: protocol and invariants verified externally in Dafny

full rationale

The paper introduces a new caching protocol whose correctness is established by a machine-checked Dafny proof rather than by any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation. The derivation chain consists of protocol rules plus invariants that are checked against an external formal model; this verification supplies independent evidence and does not reduce the claimed properties to the inputs by construction. Assumptions about observable migrations and network ordering are explicit model parameters, not hidden equivalences that collapse the result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard distributed-systems assumptions about network behavior and observable migrations rather than new fitted parameters or invented entities.

axioms (1)
  • domain assumption Client migrations are observable and the network model permits tracking of causal dependencies without extra coordination messages.
    Invoked to enable the migration handling logic while preserving coordination-free operation.

pith-pipeline@v0.9.0 · 5796 in / 1193 out tokens · 37682 ms · 2026-05-21T22:07:44.089812+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

72 extracted references · 72 canonical work pages

  1. [1]

    Palette load balancing: Locality hints for serverless functions

    Mania Abdi, Samuel Ginzburg, Xiayue Charles Lin, Jose Faleiro, Gohar Irfan Chaudhry, Inigo Goiri, Ricardo Bianchini, Daniel S Berger, and Rodrigo Fonseca. Palette load balancing: Locality hints for serverless functions. In Proceedings of the ACM European Conference on Computer Systems (EuroSys) , 2023

  2. [2]

    Firecracker: Lightweight virtualization for serverless applications

    Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. Firecracker: Lightweight virtualization for serverless applications. InProceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) , 2020

  3. [3]

    Cure: Strong semantics meets high availability and low latency

    Deepthi Devaki Akkoorath, Alejandro Z Tomsic, Manuel Bravo, Zhongmiao Li, Tyler Crain, Annette Bieniusa, Nuno Preguiça, and Marc Shapiro. Cure: Strong semantics meets high availability and low latency. In International Conference on Distributed Computing Systems (ICDCS) , 2016

  4. [4]

    Chainreaction: a causal+ consis- tent datastore based on chain replication

    Sérgio Almeida, João Leitão, and Luís Rodrigues. Chainreaction: a causal+ consis- tent datastore based on chain replication. In Proceedings of the ACM European Conference on Computer Systems (EuroSys) , 2013

  5. [5]

    Bolt-on causal consistency

    Peter Bailis, Ali Ghodsi, Joseph M Hellerstein, and Ion Stoica. Bolt-on causal consistency. In Proceedings of the ACM SIGMOD Conference (SIGMOD) , 2013

  6. [6]

    Practi replication

    Nalini Moti Belaramani, Michael Dahlin, Lei Gao, Amol Nayate, Arun Venkatara- mani, Praveen Yalagandula, and Jiandan Zheng. Practi replication. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2006

  7. [7]

    Eventual consistency: How soon is eventual? an evaluation of amazon s3’s consistency behavior

    David Bermbach and Stefan Tai. Eventual consistency: How soon is eventual? an evaluation of amazon s3’s consistency behavior. InWorkshop on Middleware for Service Oriented Computing (MW4SOC) , 2011

  8. [8]

    Bernstein

    Philip A. Bernstein. A technical perspective of CausalMesh. SIGMOD Record, 54(1), March 2025

  9. [9]

    Cypress: input size-sensitive container provisioning and request scheduling for serverless platforms

    Vivek M Bhasi, Jashwant Raj Gunasekaran, Aakash Sharma, Mahmut Taylan Kandemir, and Chita Das. Cypress: input size-sensitive container provisioning and request scheduling for serverless platforms. InProceedings of the ACM Symposium on Cloud Computing (SOCC) , 2022

  10. [10]

    Durable functions: semantics for state- ful serverless

    Sebastian Burckhardt, Chris Gillum, David Justo, Konstantinos Kallas, Connor McMahon, and Christopher S Meiklejohn. Durable functions: semantics for state- ful serverless. Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA) , 2021

  11. [11]

    https://cloudlab.us

    Cloudlab. https://cloudlab.us

  12. [12]

    Pnuts: Yahoo!’s hosted data serving platform.Proceedings of the Interna- tional Conference on Very Large Data Bases (VLDB) , 2008

    Brian F Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana Yerneni. Pnuts: Yahoo!’s hosted data serving platform.Proceedings of the Interna- tional Conference on Very Large Data Bases (VLDB) , 2008

  13. [13]

    https://azure.microsoft.com/en-us/products/cosmos-db/

    cosmosdb. https://azure.microsoft.com/en-us/products/cosmos-db/

  14. [14]

    https://www.couchbase.com/

    couchbase. https://www.couchbase.com/

  15. [15]

    https://docs.aws.amazon.com/amazon dynamodb/latest/developerguide/DAX.consistency.html

    Dax and dynamodb consistency models. https://docs.aws.amazon.com/amazon dynamodb/latest/developerguide/DAX.consistency.html

  16. [16]

    Leonardo Mendonça de Moura and Nikolaj S. Bjørner. Z3: an efficient SMT solver. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS) , volume 4963 of Lecture Notes in Computer Science , pages 337–340, 2008

  17. [17]

    https://github.com/delimitrou/DeathStarBench/

    DeathStarBench. https://github.com/delimitrou/DeathStarBench/

  18. [18]

    Dynamo: Amazon’s highly available key-value store

    Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakula- pati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s highly available key-value store. Proceedings of the ACM Symposium on Operating Systems Principles (SOSP) , 2007

  19. [19]

    Orbe: Scalable causal consistency using dependency matrices and physical clocks

    Jiaqing Du, Sameh Elnikety, Amitabha Roy, and Willy Zwaenepoel. Orbe: Scalable causal consistency using dependency matrices and physical clocks. InProceedings of the ACM Symposium on Cloud Computing (SOCC) , 2013

  20. [20]

    Gentlerain: Cheap and scalable causal consistency with physical clocks

    Jiaqing Du, Călin Iorgulescu, Amitabha Roy, and Willy Zwaenepoel. Gentlerain: Cheap and scalable causal consistency with physical clocks. In Proceedings of the ACM Symposium on Cloud Computing (SOCC) , 2014

  21. [21]

    https: //github.com/jonhoo/evmap

    evmap: A lock-free, eventually consistent, concurrent multi-value map. https: //github.com/jonhoo/evmap

  22. [22]

    An open-source benchmark suite for microser- vices and their hardware-software implications for cloud & edge systems

    Yu Gan, Yanqi Zhang, Dailun Cheng, Ankitha Shetty, Priyal Rathi, Nayan Katarki, Ariana Bruno, Justin Hu, Brian Ritchken, Brendon Jackson, Kelvin Hu, Meghna Pancholi, Yuan He, Brett Clancy, Chris Colen, Fukang Wen, Catherine Leung, Siyuan Wang, Leon Zaruvinsky, Mateo Espinosa, Rick Lin, Zhongling Liu, Jake Padilla, and Christina Delimitrou. An open-source ...

  23. [23]

    https://grpc.io/

    grpc. https://grpc.io/

  24. [24]

    Fifer: Tackling resource underuti- lization in the serverless era

    Jashwant Raj Gunasekaran, Prashanth Thinakaran, Nachiappan C Nachiappan, Mahmut Taylan Kandemir, and Chita R Das. Fifer: Tackling resource underuti- lization in the serverless era. In Proceedings of the ACM/IFIP/USENIX International Middleware Conference (Middleware), 2020

  25. [25]

    Lorch, Bryan Parno, Michael Lowell Roberts, Srinath T

    Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael Lowell Roberts, Srinath T. V. Setty, and Brian Zill. Ironfleet: Proving practical distributed systems correct. In ACM Symposium on Operating Systems Principles (SOSP), 2015

  26. [26]

    Hellerstein, Jose Faleiro, Joseph E

    Joseph M. Hellerstein, Jose Faleiro, Joseph E. Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, and Chenggang Wu. Serverless computing: One step forward, two steps back. In Conference on Innovative Data Systems Research (CIDR), 2019

  27. [27]

    Zookeeper: Wait-free coordination for internet-scale systems

    Patrick Hunt, Mahadev Konar, Flavio P Junqueira, and Benjamin Reed. Zookeeper: Wait-free coordination for internet-scale systems. In Proceedings of the USENIX Annual Technical Conference (ATC), 2010

  28. [28]

    Lifting the veil on meta’s microservice architecture: Analyses of topology and request workflows

    Darby Huye, Yuri Shkuro, and Raja R Sambasivan. Lifting the veil on meta’s microservice architecture: Analyses of topology and request workflows. In Pro- ceedings of the USENIX Annual Technical Conference (ATC) , 2023

  29. [29]

    Boki: Stateful serverless computing with shared logs

    Zhipeng Jia and Emmett Witchel. Boki: Stateful serverless computing with shared logs. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2021. Haoran Zhang, Zihao Zhang, Shuai Mu, Sebastian Angel, and Vincent Liu

  30. [30]

    Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices

    Zhipeng Jia and Emmett Witchel. Nightcore: efficient and scalable serverless computing for latency-sensitive, interactive microservices. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021

  31. [31]

    Occupy the cloud: Distributed computing for the 99%

    Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. Occupy the cloud: Distributed computing for the 99%. In Proceedings of the ACM Symposium on Cloud Computing (SOCC) , 2017

  32. [32]

    Centralized core- granular scheduling for serverless functions

    Kostis Kaffes, Neeraja J Yadwadkar, and Christos Kozyrakis. Centralized core- granular scheduling for serverless functions. In Proceedings of the ACM Sympo- sium on Cloud Computing (SOCC) , 2019

  33. [33]

    Hermod: principled and practical scheduling for serverless functions

    Kostis Kaffes, Neeraja J Yadwadkar, and Christos Kozyrakis. Hermod: principled and practical scheduling for serverless functions. In Proceedings of the ACM Symposium on Cloud Computing (SOCC) , 2022

  34. [34]

    Pocket: Elastic ephemeral storage for serverless analyt- ics

    Ana Klimovic, Yawen Wang, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, and Christos Kozyrakis. Pocket: Elastic ephemeral storage for serverless analyt- ics. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2018

  35. [35]

    How long does AWS Lambda keep your idle functions around before a cold start? https://acloudguru.com/blog/engineering/how-long-does-aws-lambda-keep- your-idle-functions-around-before-a-cold-start

  36. [36]

    The temporal logic of actions

    Leslie Lamport. The temporal logic of actions. ACM Trans. Program. Lang. Syst. , 16(3):872–923, 1994

  37. [37]

    Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers

    Leslie Lamport. Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers. 2002

  38. [38]

    Time, clocks, and the ordering of events in a distributed system

    Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. In Communications of the ACM, 2019

  39. [39]

    Dafny: An automatic program verifier for functional correct- ness

    K Rustan M Leino. Dafny: An automatic program verifier for functional correct- ness. In International conference on logic for programming artificial intelligence and reasoning, pages 348–370, 2010

  40. [40]

    Tlc model checker, 2021

    Leslie. Tlc model checker, 2021

  41. [41]

    The serverless computing survey: A technical primer for design architecture

    Zijun Li, Linsong Guo, Jiagan Cheng, Quan Chen, BingSheng He, and Minyi Guo. The serverless computing survey: A technical primer for design architecture. ACM Computing Surveys, 2022

  42. [42]

    Liu, Shadi Noghabi, Sebastian Burckhardt, and Amit Levy

    David H. Liu, Shadi Noghabi, Sebastian Burckhardt, and Amit Levy. Doing more with less: Orchestrating serverless applications without an orchestrator. In Pro- ceedings of the USENIX Symposium on Networked Systems Design and Implemen- tation (NSDI), 2023

  43. [43]

    Don’t settle for eventual: Scalable causal consistency for wide-area storage with cops

    Wyatt Lloyd, Michael J Freedman, Michael Kaminsky, and David G Andersen. Don’t settle for eventual: Scalable causal consistency for wide-area storage with cops. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2011

  44. [44]

    Stronger semantics for low-latency geo-replicated storage

    Wyatt Lloyd, Michael J Freedman, Michael Kaminsky, and David G Andersen. Stronger semantics for low-latency geo-replicated storage. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) , 2013

  45. [45]

    Characterizing microservice dependency and performance: Alibaba trace analysis

    Shutian Luo, Huanle Xu, Chengzhi Lu, Kejiang Ye, Guoyao Xu, Liping Zhang, Yu Ding, Jian He, and Chengzhong Xu. Characterizing microservice dependency and performance: Alibaba trace analysis. In Proceedings of the ACM Symposium on Cloud Computing (SOCC) , 2021

  46. [46]

    Faastcc: efficient transac- tional causal consistency for serverless computing

    Taras Lykhenko, Rafael Soares, and Luis Rodrigues. Faastcc: efficient transac- tional causal consistency for serverless computing. In Proceedings of the ACM/I- FIP/USENIX International Middleware Conference (Middleware) , 2021

  47. [47]

    Consistency, availability

    Prince Mahajan, Lorenzo Alvisi, and Mike Dahlin. Consistency, availability. Tech- nical report, and convergence. Technical Report TR-11-22, Univ. Texas at Austin, 2011

  48. [48]

    I can’t believe it’s not causal! scalable causal consistency with no slowdown cascades

    Syed Akbar Mehdi, Cody Littley, Natacha Crooks, Lorenzo Alvisi, Nathan Bronson, and Wyatt Lloyd. I can’t believe it’s not causal! scalable causal consistency with no slowdown cascades. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) , 2017

  49. [49]

    https://www.mongodb.com/

    mongodb. https://www.mongodb.com/

  50. [50]

    Zanzibar: Google’s consistent, global authorization system

    Ruoming Pang, Ramon Caceres, Mike Burrows, Zhifeng Chen, Pratik Dave, Nathan Germer, Alexander Golynski, Kevin Graney, Nina Kang, Lea Kissner, et al. Zanzibar: Google’s consistent, global authorization system. InProceedings of the USENIX Annual Technical Conference (ATC), 2019

  51. [51]

    Flexible update propagation for weakly consistent replication

    Karin Petersen, Mike J Spreitzer, Douglas B Terry, Marvin M Theimer, and Alan J Demers. Flexible update propagation for weakly consistent replication. In Pro- ceedings of the ACM Symposium on Operating Systems Principles (SOSP) , 1997

  52. [52]

    Shuffling, fast and slow: Scalable analytics on serverless infrastructure

    Qifan Pu, Shivaram Venkataraman, and Ion Stoica. Shuffling, fast and slow: Scalable analytics on serverless infrastructure. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) , 2019

  53. [53]

    https://redis.io/

    redis. https://redis.io/

  54. [54]

    Faa$t: A transparent auto-scaling cache for serverless applications

    Francisco Romero, Gohar Irfan Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, and Ricardo Bianchini. Faa$t: A transparent auto-scaling cache for serverless applications. InProceedings of the ACM Symposium on Cloud Computing (SOCC) , 2021

  55. [55]

    What serverless computing is and should become: The next phase of cloud computing

    Johann Schleier-Smith, Vikram Sreekanti, Anurag Khandelwal, Joao Carreira, Neeraja J Yadwadkar, Raluca Ada Popa, Joseph E Gonzalez, Ion Stoica, and David A Patterson. What serverless computing is and should become: The next phase of cloud computing. Communications of the ACM, 2021

  56. [56]

    Serverless computing: a survey of opportunities, challenges, and applications

    Hossein Shafiei, Ahmad Khonsari, and Payam Mousavi. Serverless computing: a survey of opportunities, challenges, and applications. ACM Computing Surveys, 2022

  57. [57]

    An efficient implementation of vector clocks

    Mukesh Singhal and Ajay Kshemkalyani. An efficient implementation of vector clocks. Information Processing Letters, 1992

  58. [58]

    Cloudburst: stateful functions-as-a-service

    Vikram Sreekanti, Chenggang Wu, Xiayue Charles Lin, Johann Schleier-Smith, Joseph E Gonzalez, Joseph M Hellerstein, and Alexey Tumanov. Cloudburst: stateful functions-as-a-service. Proceedings of the International Conference on Very Large Data Bases (VLDB) , 2020

  59. [59]

    Twine: A unified cluster management system for shared infrastructure

    Chunqiang Tang, Kenny Yu, Kaushik Veeraraghavan, Jonathan Kaldor, Scott Michelson, Thawan Kooburat, Aravind Anbudurai, Matthew Clark, Kabir Gogia, Long Cheng, et al. Twine: A unified cluster management system for shared infrastructure. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI) , 2020

  60. [60]

    Implementation of cluster-wide logical clock and causal consistency in mongodb

    Misha Tyulenev, Andy Schwerin, Asya Kamsky, Randolph Tan, Alyson Cabral, and Jack Mulrow. Implementation of cluster-wide logical clock and causal consistency in mongodb. In Proceedings of the ACM SIGMOD Conference (SIGMOD) , 2019

  61. [61]

    Chain replication for supporting high throughput and availability

    Robbert Van Renesse and Fred B Schneider. Chain replication for supporting high throughput and availability. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI) , 2004

  62. [62]

    Infinicache: Exploiting ephemeral serverless functions to build a cost-effective memory cache

    Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, Dim- itrios Skourtis, Vasily Tarasov, Feng Yan, and Yue Cheng. Infinicache: Exploiting ephemeral serverless functions to build a cost-effective memory cache. In Pro- ceedings of the USENIX Conference on File and Storage Technologies (FAST) , 2020

  63. [63]

    https: //github.com/giltene/wrk2

    wrk2: A constant throughput, correct latency recording variant of wrk. https: //github.com/giltene/wrk2

  64. [64]

    Anna: A kvs for any scale

    Chenggang Wu, Jose M Faleiro, Yihan Lin, and Joseph M Hellerstein. Anna: A kvs for any scale. IEEE Transactions on Knowledge and Data Engineering (TKDE) , 2019

  65. [65]

    Autoscaling tiered cloud storage in anna

    Chenggang Wu, Vikram Sreekanti, and Joseph M Hellerstein. Autoscaling tiered cloud storage in anna. Proceedings of the International Conference on Very Large Data Bases (VLDB), 2019

  66. [66]

    Transactional causal consistency for serverless computing

    Chenggang Wu, Vikram Sreekanti, and Joseph M Hellerstein. Transactional causal consistency for serverless computing. In Proceedings of the ACM SIGMOD Conference (SIGMOD), 2020

  67. [67]

    Design and evaluation of a continuous consistency model for repli- cated services

    Haifeng Yu. Design and evaluation of a continuous consistency model for repli- cated services. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI) , 2000

  68. [68]

    Restructuring serverless computing with data-centric function orchestration

    Minchen Yu, Tingjia Cao, Wei Wang, and Ruichuan Chen. Restructuring serverless computing with data-centric function orchestration. arXiv:2109.13492, 2021. https: //arxiv.org/abs/2109.13492

  69. [69]

    Write fast, read in the past: Causal consistency for client-side applications

    Marek Zawirski, Nuno Preguiça, Sérgio Duarte, Annette Bieniusa, Valter Balegas, and Marc Shapiro. Write fast, read in the past: Causal consistency for client-side applications. In Proceedings of the ACM/IFIP/USENIX International Middleware Conference (Middleware), 2015

  70. [70]

    Fault-tolerant and transactional stateful serverless workflows

    Haoran Zhang, Adney Cardoza, Peter Baile Chen, Sebastian Angel, and Vincent Liu. Fault-tolerant and transactional stateful serverless workflows. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2020

  71. [71]

    Causalmesh: A causal cache for stateful serverless computing

    Haoran Zhang, Shuai Mu, Sebastian Angel, and Vincent Liu. Causalmesh: A causal cache for stateful serverless computing. The Proceedings of the VLDB Endowment (PVLDB), 17(13), 2024

  72. [72]

    Caerus: Nimble task scheduling for serverless analytics

    Hong Zhang, Yupeng Tang, Anurag Khandelwal, Jingrong Chen, and Ion Stoica. Caerus: Nimble task scheduling for serverless analytics. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI) , 2021