GeoGauss: Strongly Consistent and Light-Coordinated OLTP for Geo-Replicated SQL Database
Pith reviewed 2026-05-24 09:58 UTC · model grok-4.3
The pith
GeoGauss achieves strong consistency in geo-replicated SQL databases through a full-replica multi-master architecture and an epoch-based merge rule.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GeoGauss adopts a full replica multi-master architecture in which every node serves as a master. A multi-master OCC protocol merges updates arriving from different masters according to an epoch-based delta state merge rule. This unification of replication and concurrent transaction processing, together with optimistic asynchronous execution, produces strong consistency under a light-coordinated protocol while permitting additional concurrency through weak isolation.
What carries the argument
The multi-master OCC with epoch-based delta state merge rule, which merges concurrent updates from multiple masters and handles replication in the same step.
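To make the idea of a deterministic per-epoch merge concrete, here is a minimal sketch. It assumes a first-writer-wins rule keyed on `(commit_ts, node_id)`; this summary does not spell out the paper's actual rule, so the rule, the `Write` type, and all field names are hypothetical. The point is only that replicas applying the same epoch's writes in different arrival orders end up with the same state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: object
    commit_ts: int  # hypothetical: local commit timestamp within the epoch
    node_id: int    # hypothetical: id of the master that produced the write


def merge_epoch(state, writes):
    """Apply one epoch's writes; per key, the smallest (commit_ts, node_id) wins."""
    winners = {}
    for w in writes:
        cur = winners.get(w.key)
        if cur is None or (w.commit_ts, w.node_id) < (cur.commit_ts, cur.node_id):
            winners[w.key] = w
    new_state = dict(state)
    for key, w in winners.items():
        new_state[key] = w.value
    return new_state


# Two masters receive the same epoch's writes in different orders.
epoch_writes = [
    Write("x", "from-A", commit_ts=5, node_id=1),
    Write("x", "from-B", commit_ts=3, node_id=2),
    Write("y", "from-B", commit_ts=7, node_id=2),
]
replica_a = merge_epoch({}, epoch_writes)
replica_b = merge_epoch({}, list(reversed(epoch_writes)))
```

Because the winner for each key depends only on the set of writes in the epoch and not on their arrival order, both replicas converge without exchanging acknowledgments about ordering.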
If this is right
- Cross-shard transactions avoid the multiple round-trip acknowledgments required by two-phase commit.
- Clients write locally to any master instead of sending every update across regions.
- Higher concurrency is possible because weak isolation is accepted where it meets application needs.
- The same mechanism that replicates data also resolves conflicts, removing a separate replication layer.
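A back-of-envelope comparison illustrates the first bullet. This is a toy latency model, not a measurement: it assumes each 2PC phase costs one cross-region round trip, and that an epoch-based commit waits for the epoch to close and then needs one one-way transfer to peers. The 100 ms round-trip time is a hypothetical value; the 10 ms epoch length follows the example epoch length quoted from the paper elsewhere on this page.

```python
def two_phase_commit_latency_ms(rtt_ms, round_trips=2):
    """Toy model: each 2PC phase costs one cross-region round trip."""
    return round_trips * rtt_ms


def epoch_merge_latency_ms(rtt_ms, epoch_ms):
    """Toy model: wait out the epoch, then one one-way transfer to peers."""
    return epoch_ms + rtt_ms / 2


rtt_ms = 100.0   # hypothetical cross-region round-trip time
epoch_ms = 10.0  # example epoch length
two_pc = two_phase_commit_latency_ms(rtt_ms)
epoch = epoch_merge_latency_ms(rtt_ms, epoch_ms)
```

The model ignores execution time, queueing, and stragglers, so it only indicates why removing acknowledgment round trips matters when the round-trip time dominates.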
Where Pith is reading between the lines
- The same merge rule might be tested on workloads with higher write contention to see where the optimistic path breaks down.
- If the epoch length can be tuned without losing consistency, the design could be applied to clusters spanning more than a handful of regions.
- The unification of replication and concurrency control suggests a path for reducing the number of distinct protocols that a geo-distributed engine must maintain.
Load-bearing premise
The epoch-based delta state merge rule can keep replicas strongly consistent across regions without the heavy coordination steps used in sharded master-follower systems.
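One way to probe this premise is to check that a candidate merge function is associative, commutative, and idempotent (ACI), the property that lets replicas apply deltas in any order or more than once. The sketch below assumes a delta state is a map from key to a `(ts, node, value)` tuple and that the merge keeps the entry with the smallest `(ts, node)` tag; both the representation and the rule are illustrative assumptions, not the paper's definitions.

```python
from itertools import product


def merge(a, b):
    """Join two delta states: per key, keep the entry with the smallest (ts, node) tag."""
    out = dict(a)
    for key, entry in b.items():
        if key not in out or entry[:2] < out[key][:2]:
            out[key] = entry
    return out


# Hypothetical delta states; each (ts, node) tag is unique per key.
deltas = [
    {},
    {"x": (3, 2, "b")},
    {"x": (5, 1, "a"), "y": (7, 2, "c")},
    {"y": (4, 3, "d")},
]

commutative = all(merge(a, b) == merge(b, a) for a, b in product(deltas, repeat=2))
associative = all(
    merge(merge(a, b), c) == merge(a, merge(b, c))
    for a, b, c in product(deltas, repeat=3)
)
idempotent = all(merge(a, a) == a for a in deltas)
```

An exhaustive check over a few samples is not a proof, and it relies on tags being unique; two entries with the same tag but different values would break commutativity under this rule.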
What would settle it
A geo-distributed TPC-C run in which cross-region conflicting transactions produce either a consistency violation or throughput no higher than a single-master baseline.
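The consistency half of such a test could be automated by comparing replica states after each run. The following is a minimal sketch, assuming each replica's state can be dumped as a JSON-serializable key-value map; `state_digest`, `diverged`, and the region names are hypothetical helpers, not part of GeoGauss or TPC-C tooling.

```python
import hashlib
import json


def state_digest(state):
    """Hash a replica's key-value state in a canonical (sorted-key) encoding."""
    encoded = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def diverged(replicas):
    """Return True if any two replicas hold different states."""
    return len({state_digest(s) for s in replicas.values()}) > 1


# Hypothetical snapshots taken after all epochs have been applied.
replicas_ok = {"us": {"x": 1, "y": 2}, "eu": {"y": 2, "x": 1}}
replicas_bad = {"us": {"x": 1}, "eu": {"x": 2}}
```

A digest comparison only detects disagreement between replicas at a quiescent point; it does not by itself show that the agreed state satisfies the intended isolation level.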
Original abstract
Multinational enterprises conduct global business that has a demand for geo-distributed transactional databases. Existing state-of-the-art databases adopt a sharded master-follower replication architecture. However, the single-master serving mode incurs massive cross-region writes from clients, and the sharded architecture requires multiple round-trip acknowledgments (e.g., 2PC) to ensure atomicity for cross-shard transactions. These limitations drive us to seek yet another design choice. In this paper, we propose a strongly consistent OLTP database GeoGauss with full replica multi-master architecture. To efficiently merge the updates from different master nodes, we propose a multi-master OCC that unifies data replication and concurrent transaction processing. By leveraging an epoch-based delta state merge rule and the optimistic asynchronous execution, GeoGauss ensures strong consistency with light-coordinated protocol and allows more concurrency with weak isolation, which are sufficient to meet our needs. Our geo-distributed experimental results show that GeoGauss achieves 7.06X higher throughput and 17.41X lower latency than the state-of-the-art geo-distributed database CockroachDB on the TPC-C benchmark.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes GeoGauss, a strongly consistent geo-replicated SQL OLTP database using a full-replica multi-master architecture. It introduces a multi-master OCC protocol unified with data replication via an epoch-based delta state merge rule and optimistic asynchronous execution to achieve strong consistency with only light coordination while permitting weak isolation for higher concurrency. Experiments on TPC-C report 7.06X higher throughput and 17.41X lower latency versus CockroachDB.
Significance. If the consistency argument and performance claims hold, the work offers a concrete alternative to sharded master-follower designs for multi-region transactional workloads by reducing cross-region writes and 2PC overhead; the unification of replication and concurrency control is a notable architectural contribution.
major comments (2)
- [Protocol design section] The central claim that the epoch-based delta state merge rule plus multi-master OCC guarantees strong consistency (replica agreement) under optimistic execution is load-bearing yet lacks a formal argument or invariant proof in the protocol description; without it the unification of replication and concurrency control cannot be verified from the given design sketch.
- [Evaluation section] Table or figure reporting TPC-C results: the 7.06X throughput and 17.41X latency claims are presented without workload parameters (number of warehouses, regions, client placement), CockroachDB configuration details, run counts, or error bars, preventing assessment of the cross-system comparison.
minor comments (2)
- [Abstract] The abstract states that weak isolation is 'sufficient to meet our needs' but does not name the isolation level or the workloads for which it is adequate.
- [Design] Notation for the delta state merge rule should be defined before its first use to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of our consistency argument and experimental details. We address each major comment below and will revise the manuscript accordingly.
- Referee: [Protocol design section] The central claim that the epoch-based delta state merge rule plus multi-master OCC guarantees strong consistency (replica agreement) under optimistic execution is load-bearing yet lacks a formal argument or invariant proof in the protocol description; without it the unification of replication and concurrency control cannot be verified from the given design sketch.
  Authors: We agree that a formal argument would make the consistency guarantee easier to verify. In the revised manuscript we will add a new subsection to the protocol design section that states the key invariants (e.g., epoch ordering, delta-state commutativity under the merge rule) and provides a proof sketch showing that the combination of multi-master OCC and epoch-based merging ensures replica agreement and strong consistency even when transactions execute optimistically and asynchronously. Revision: yes.
- Referee: [Evaluation section] Table or figure reporting TPC-C results: the 7.06X throughput and 17.41X latency claims are presented without workload parameters (number of warehouses, regions, client placement), CockroachDB configuration details, run counts, or error bars, preventing assessment of the cross-system comparison.
  Authors: We acknowledge that the current evaluation section omits several parameters required for reproducibility. In the revised manuscript we will augment the TPC-C results with the exact workload parameters (number of warehouses, number of regions, client placement), CockroachDB configuration settings used for comparison, the number of independent runs performed, and error bars (or standard deviations) on the reported throughput and latency numbers. Revision: yes.
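The reporting requested here can be as simple as a mean and sample standard deviation per system over independent runs, with the speedup computed from the means. A minimal sketch follows; the throughput samples are hypothetical placeholders and are not taken from the paper.

```python
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) over independent runs."""
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(system, baseline):
    """Ratio of mean throughputs; report it alongside both spreads."""
    return statistics.mean(system) / statistics.mean(baseline)


# Hypothetical throughput samples (transactions per second), not from the paper.
system_tps = [1000.0, 1100.0, 900.0]
baseline_tps = [200.0, 250.0, 150.0]

system_mean, system_sd = summarize(system_tps)
baseline_mean, baseline_sd = summarize(baseline_tps)
ratio = speedup(system_tps, baseline_tps)
```

With only a handful of runs the standard deviation is itself a rough estimate, so stating the run count next to each figure matters as much as the error bar.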
Circularity Check
No significant circularity in derivation chain
The paper proposes a multi-master OCC architecture with an epoch-based delta state merge rule for geo-replicated OLTP, claiming strong consistency via light coordination and weak isolation. These claims are supported by implementation details and direct TPC-C benchmark comparisons to CockroachDB (7.06X higher throughput, 17.41X lower latency), not by any equations, fitted parameters presented as predictions, or self-referential definitions. No load-bearing steps reduce to inputs by construction; the work is a systems design validated externally through experiments.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/DimensionForcing.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: epoch-based delta state merge rule ... epoch is a short period of time (e.g., 10 ms) ... DeltaCRDTMerge ... ACI property
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: multi-master OCC that unifies data replication and concurrent transaction processing
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.