arxiv: 2604.09589 · v1 · submitted 2026-03-02 · 💻 cs.CC · cs.LO· cs.PL

Recognition: no theorem link

Complexity of Consistency Testing for the Release-Acquire Semantics

R. Govind , S. Krishna , Sanchari Sil , B. Srivathsan

Authors on Pith no claims yet

Pith reviewed 2026-05-15 17:07 UTC · model grok-4.3

classification 💻 cs.CC cs.LOcs.PL

keywords consistency testingrelease-acquire semanticsC11 memory modelweak memory modelsNP-hardnesspolynomial-time algorithmExponential Time Hypothesis

0 comments

The pith

Consistency testing for release-acquire memory is polynomial-time when each location has at most one writer.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that deciding whether an observed sequence of reads and writes is consistent under release-acquire semantics can be done in polynomial time when each memory location is written by only one thread. This restriction does not help for sequential consistency, where the problem remains NP-hard. The result is complemented by hardness: the decision problem becomes NP-complete as soon as two threads can write the same location, and with three writers per location it cannot be solved in 2 to the o(k) times polynomial in n time under the Exponential Time Hypothesis.

Core claim

Consistency testing under the release-acquire variant of the C11 memory model lies in P when every memory location is written by at most one thread. The same problem is NP-hard when two threads may write any given location, and under the Exponential Time Hypothesis it admits no 2^{o(k)} n^{O(1)} algorithm once three writers per location are permitted.

What carries the argument

A polynomial-time graph reachability procedure that exploits the single-writer restriction to construct a total order on writes and check release-acquire synchronization edges without enumerating interleavings.

Load-bearing premise

The release-acquire semantics are taken exactly as defined in the C11 standard and prior literature, without extra constraints on thread or location counts.

What would settle it

An execution with one writer per location on which the claimed polynomial algorithm outputs 'consistent' yet an exhaustive enumeration of possible interleavings shows a violation of release-acquire ordering.

Figures

Figures reproduced from arXiv: 2604.09589 by B. Srivathsan, R. Govind, Sanchari Sil, S. Krishna.

**Figure 2.** Figure 2: From left: rf0 ⊑ rf1 ⊑ rf2. rf0 has a violation of weak-read-coherence: r3 rf−1 0 −−−→ w1 po −→ w2 hb0 −−→ r3. rf1 has a violation of weak-read-coherence: r4 rf−1 1 −−−→ w2 po −→ w3 hb1 −−→ r4. rf2 does not have any more violations of weak-read-coherence. porf-acyclicity is also satisfied by rf2. The underlying partial execution graph is WRA-consistent. 3.2 Algorithm Our algorithm starts from an rf0 which … view at source ↗

**Figure 3.** Figure 3: From left: rf0 ⊑ rf1. rf0 has a violation of weak-readcoherence: r3 rf−1 0 −−−→ w1 po −→ w2 hb −→ r3. rf1 has no weak-read-coherence violation but does not satisfy porfacyclicity: w5 rf1 −−→ r3 po −→ w4 rf −→ r2 po −→ w5. The partial execution graph is not WRA-consistent. From rf i to rf i+1. Pick a violating cycle for weak-read-coherence: ri rf−1 i −−→ wi hbi −−→ w ′ i hbi −−→ ri where all three events … view at source ↗

**Figure 4.** Figure 4: From triangles to violations of weakread-coherence Our idea for the reduction is as follows. We will construct X in such a way that for every r(x, v) there is a unique w(x, v) present in the entire program. Therefore, the rf relation is fixed. If G has a triangle, we will have a shared variable a and a cycle of the form: w(a, 0) po −→ w(a, 1) hb −→ r(a, 0) rf−1 −−→ w(a, 0). This cycle violates weak-read… view at source ↗

**Figure 5.** Figure 5: Threads in the reduction, for i ∈ {1, 2, . . . , k} Thus, if there was a 2 o(k) · n O(1) algorithm for the VM problem, we could solve 3-CNF-SAT in 2 o(k) .(k + m) O(1) time: from φ, construct the VM instance in O(k + m) time, and then use the 2 o(k) · (k + m) O(1) procedure for VM, thereby contradicting ETH. To show the last item above, we make use of what is known in the literature as a range reduction … view at source ↗

**Figure 6.** Figure 6: A partial execution X for a graph G = ({1, 2, 3}, {(1, 2),(2, 1),(1, 3),(3, 1),(2, 3),(3, 2)}). We have (w(a1, 0), r(a1, 0)) ∈ hb violating weak read coherence in X. shows the reduction on a graph with 3 vertices {1, 2, 3}, having edges between every distinct pair of vertices. Correctness and Time Complexity. Assume that G has a triangle (x, y, z). Wlg, let x < y < z. Then w(ax, 1) hb r(ax, 0) as follows: … view at source ↗

**Figure 7.** Figure 7: Violation of (a) relaxed-write-coherence: [PITH_FULL_IMAGE:figures/full_fig_p027_7.png] view at source ↗

**Figure 8.** Figure 8: Threads in the reduction for Relaxed The threads Init1 and Init0 initialize the valuation used to evaluate the formula. For each variable xi corresponding to the Boolean variables of φ, Init1 writes 1 and Init0 writes 0. Txi has a read statement r(vi , 0) followed by r(vi , 1) (which denotes reading true for xi) and sets all clauses containing xi to true, using the statement w(cj , 1) for all j ∈ Sxi . S… view at source ↗

read the original abstract

In a seminal work, Gibbons and Korach studied the complexity of deciding whether an observed sequence of reads and writes of a multi-threaded program admits a sequentially consistent interleaving. They showed the problem to be NP-hard even under strong syntactic restrictions. More recently, Chakraborty et al. considered the problem for weak memory models and proved that NP-hardness remains even when the number of threads, the number of memory locations, and the value domain are all bounded. In this paper we revisit the problem for the release-acquire variants of the C11 memory model. Our main positive result is that consistency testing can be done in polynomial-time when each memory location is written by at most one thread (multiple readers are allowed). Notably, this restriction is already NP-hard for sequential consistency. We complement this upper bound with tight hardness results: the problem is NP-hard when two threads may write to the same location, and allowing three writers per location rules out 2^{o(k)}.n^{O(1)} algorithms under the Exponential Time Hypothesis, where k denotes the number of threads, and n the number of memory operations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows consistency testing for release-acquire is polynomial under single-writer-per-location, a clean separation from the NP-hardness known for sequential consistency, plus tight NP and ETH bounds at two and three writers.

read the letter

The main thing to know is that consistency checking under release-acquire becomes polynomial-time when each location has at most one writer. This stands in direct contrast to the NP-hardness that already holds for sequential consistency under the identical syntactic restriction. The paper pairs this with NP-hardness once two writers per location are allowed and an ETH-based lower bound ruling out 2^{o(k)} n^{O(1)} algorithms at three writers, where k is the number of threads.

Referee Report

0 major / 3 minor

Summary. The paper studies the complexity of deciding whether an observed execution trace is consistent with the release-acquire (RA) fragment of the C11 memory model. The central positive result is a polynomial-time algorithm for consistency testing when each memory location has at most one writer (multiple readers permitted). This is contrasted with the known NP-hardness of the same syntactic restriction under sequential consistency. The paper complements the upper bound with matching hardness: NP-hardness already holds when two threads may write the same location, and an ETH-based lower bound rules out 2^{o(k)} n^{O(1)} algorithms when three writers per location are allowed (k = number of threads, n = number of operations).

Significance. If the results hold, the work supplies a clean complexity map for RA consistency checking that is absent from prior literature. The polynomial-time algorithm under the single-writer restriction is a concrete algorithmic contribution that exploits the RA semantics in a way that does not apply to sequential consistency; the hardness results are tight and obtained via standard parameterized-complexity reductions. The paper therefore both improves the state of the art for weak-memory verification and clarifies why the choice of memory model qualitatively changes the complexity of the consistency problem.

minor comments (3)

[§4] The high-level description of the polynomial-time algorithm (presumably in §4) would be easier to follow if a short pseudocode block or a diagram of the dependency-graph construction were added.
[Theorem 5.3] In the ETH lower-bound statement, the parameter k is defined as the number of threads; it would help to restate this explicitly in the theorem statement rather than only in the surrounding text.
[§2] A few citations to the original C11 RA formalization (e.g., the precise happens-before and release-acquire axioms used) appear only in the introduction; repeating the key definitions in a dedicated preliminary section would improve self-containedness.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. The summary accurately captures the main contributions, and we are glad that the distinction between the release-acquire model and sequential consistency is viewed as a useful clarification for the complexity of consistency checking.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper establishes a polynomial-time algorithm for RA consistency testing under the single-writer-per-location restriction via direct algorithmic construction, and complements it with standard NP-hardness reductions (from known problems) and ETH lower bounds. These steps rely on independent complexity-theoretic techniques and the external C11 RA model definition; no self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear. The contrast with SC hardness is external and does not reduce the central claims to their own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The results rest on the standard definition of release-acquire consistency from the C11 model and basic facts from complexity theory such as NP-completeness and the Exponential Time Hypothesis; no free parameters or new entities are introduced.

axioms (2)

domain assumption Release-acquire semantics defined exactly as in the C11 standard and prior literature on weak memory models
The polynomial algorithm and all reductions are with respect to this fixed model definition.
standard math Standard NP-completeness and ETH assumptions from complexity theory
Used to establish the hardness results for two and three writers.

pith-pipeline@v0.9.0 · 5507 in / 1286 out tokens · 66817 ms · 2026-05-15T17:07:35.639473+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

[1]

In: CAV (1)

Agarwal, P., Chatterjee, K., Pathak, S., Pavlogiannis, A., Toman, V.: Stateless model checking under a reads-value-from equivalence. In: CAV (1). Lecture Notes in Computer Science, vol. 12759, pp. 341–366. Springer (2021)

work page 2021
[2]

Ahamad, M., Neiger, G., Burns, J.E., Kohli, P., Hutto, P.W.: Causal mem- ory: Definitions, implementation, and programming. Distrib. Comput.9(1), 37–49 (mar 1995).https://doi.org/10.1007/BF01784241,https://doi.org/10.1007/ BF01784241

work page doi:10.1007/bf01784241 1995
[3]

In: POPL’11

Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concurrency. In: POPL’11. pp. 55–66. ACM (2011).https://doi.org/10.1145/ 1926385.1926394

work page arXiv 2011
[4]

In: POPL

Bouajjani, A., Enea, C., Guerraoui, R., Hamza, J.: On verifying causal consistency. In: POPL. pp. 626–638. ACM (2017)

work page 2017
[5]

Proceedings of the ACM on Programming Languages5(OOPSLA), 1–30 (2021)

Bui, T.L., Chatterjee, K., Gautam, T., Pavlogiannis, A., Toman, V.: The reads- from equivalence for the tso and pso memory models. Proceedings of the ACM on Programming Languages5(OOPSLA), 1–30 (2021)

work page 2021
[6]

Foundations and Trends in Programming Languages1(1-2), 1–150 (2014).https://doi.org/10.1561/ 2500000011

Burckhardt, S.: Principles of Eventual Consistency. Foundations and Trends in Programming Languages1(1-2), 1–150 (2014).https://doi.org/10.1561/ 2500000011

work page 2014
[7]

ACM Program

Chakraborty, S., Krishna, S.N., Mathur, U., Pavlogiannis, A.: How hard is weak- memory testing? Proc. ACM Program. Lang.8(POPL), 1978–2009 (2024)

work page 1978
[8]

In: HPCA

Chen, Y., Lv, Y., Hu, W., Chen, T., Shen, H., Wang, P., Pan, H.: Fast complete memory consistency verification. In: HPCA. pp. 381–392. IEEE Computer Society (2009)

work page 2009
[9]

In: Saxena, N., Simon, S

Chini, P., Saivasan, P.: A framework for consistency algorithms. In: Saxena, N., Simon, S. (eds.) 40th IARCS Annual Conference on Foundations of Soft- ware Technology and Theoretical Computer Science, FSTTCS 2020, December 14-18, 2020, BITS Pilani, K K Birla Goa Campus, Goa, India (Virtual Confer- ence). LIPIcs, vol. 182, pp. 42:1–42:17. Schloss Dagstuhl ...

work page doi:10.4230/lipics.fsttcs.2020.42 2020
[10]

Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput.26(4), 1208–1244 (1997)

work page 1997
[11]

Impagliazzo, R., Paturi, R., Zane, F.: Which problems have strongly exponential complexity? J. Comput. Syst. Sci.63(4), 512–530 (2001) Complexity of Consistency Testing for the RA Semantics 17

work page 2001
[12]

Addison–Wesley (1992)

JáJá, J.: An Introduction to Parallel Algorithms. Addison–Wesley (1992)

work page 1992
[13]

Kokologiannakis, M., Lahav, O., Sagonas, K., Vafeiadis, V.: Effective stateless model checking for c/c++ concurrency. Proc. ACM Program. Lang.2(POPL) (Dec 2017).https://doi.org/10.1145/3158105,https://doi.org/10.1145/3158105

work page doi:10.1145/3158105 2017
[14]

Kokologiannakis, M., Lahav, O., Vafeiadis, V.: Kater: Automating weak mem- ory model metatheory and consistency checking. Proc. ACM Program. Lang. 7(POPL) (Jan 2023).https://doi.org/10.1145/3571212,https://doi.org/10. 1145/3571212

work page doi:10.1145/3571212 2023
[15]

Lahav, O., Boker, U.: What’s decidable about causally consistent shared memory? ACM Trans. Program. Lang. Syst.44(2) (apr 2022).https://doi.org/10.1145/ 3505273

work page 2022
[16]

Lahav, O., Boker, U.: What’s decidable about causally consistent shared memory? ACM Trans. Program. Lang. Syst.44(2), 8:1–8:55 (2022).https://doi.org/10. 1145/3505273,https://doi.org/10.1145/3505273

work page doi:10.1145/3505273 2022
[17]

In: PLDI 2017

Lahav, O., Vafeiadis, V., Kang, J., Hur, C.K., Dreyer, D.: Repairing sequential con- sistency in C/C++11. In: PLDI 2017. pp. 618–632 (2017).https://doi.org/10. 1145/3062341.3062352, technical Appendix Available athttps://plv.mpi-sws. org/scfix/full.pdf

work page arXiv 2017
[18]

Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM21(7), 558–565 (1978)

work page 1978
[19]

Lokshtanov, D., Marx, D., Saurabh, S.: Lower bounds based on the exponen- tial time hypothesis. Bull. EATCS105, 41–72 (2011),http://eatcs.org/beatcs/ index.php/beatcs/article/view/92

work page 2011
[20]

Luo, W., Demsky, B.: C11Tester: A Race Detector for C/C++ Atomics, pp. 630–

work page
[21]

Association for Computing Machinery, New York, NY, USA (2021).https: //doi.org/10.1145/3445814.3446711

work page doi:10.1145/3445814.3446711 2021
[22]

In: HPCA

Manovit, C., Hangal, S.: Completely verifying memory consistency of test program executions. In: HPCA. pp. 166–175. IEEE Computer Society (2006)

work page 2006
[23]

Margalit, R., Lahav, O.: Verifying observational robustness against a c11-style memory model. Proc. ACM Program. Lang.5(POPL) (Jan 2021).https://doi. org/10.1145/3434285,https://doi.org/10.1145/3434285

work page doi:10.1145/3434285 2021
[24]

In: LICS

Mathur, U., Pavlogiannis, A., Viswanathan, M.: The complexity of dynamic data race prediction. In: LICS. pp. 713–727. ACM (2020)

work page 2020
[25]

In: Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications

Norris, B., Demsky, B.: Cdschecker: checking concurrent data structures written with c/c++ atomics. In: Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications. p. 131–150. OOPSLA ’13, Association for Computing Machinery, New York, NY, USA(2013).https://doi.org/10.1145/2509136.2509514,ht...

work page doi:10.1145/2509136.2509514 2013
[26]

IEEE Trans

Qadeer, S.: Verifying sequential consistency on shared-memory multiprocessors by model checking. IEEE Trans. Parallel Distributed Syst.14(8), 730–741 (2003)

work page 2003
[27]

Tunç, H.C., Abdulla, P.A., Chakraborty, S., Krishna, S., Mathur, U., Pavlogiannis, A.: Optimal reads-from consistency checking for c11-style memory models. Proc. ACM Program. Lang.7(PLDI) (jun 2023).https://doi.org/10.1145/3591251, https://doi.org/10.1145/3591251

work page doi:10.1145/3591251 2023
[28]

happened-before

Vassilevska Williams, V., Williams, R.R.: Subcubic equivalences between path, matrix, and triangle problems. J. ACM65(5), 27:1–27:38 (2018) 18 R. Govind et al. A Appendix for Section 3 A.1 Proofs of Section 3.1 Lemma 2.LetX=⟨E,po,rf,mo⟩be a1-Writerexecution graph. Then, for M ∈ {SRA,RA}, we haveX|=Miffmoagrees withpoandX|=WRA. Proof.Let us begin withM=RA....

work page 2018
[29]

For every(e 1, e2)∈hbsuch that(e i, e)∈hbfor eachi∈[2], we have (e1, e2)∈ob e

work page
[30]

The idea is that any thread which executes the readrmust observewafter it has observedw ′ so that it does not read an obsolete value

Foreveryconflictingtriplets(w,r,w ′)suchthat:(ii)(w ′,r)∈ob e and(iii)(r, e)∈ po?, we have(w′,w)∈ob e. The idea is that any thread which executes the readrmust observewafter it has observedw ′ so that it does not read an obsolete value. For any evente, obe containshb, andobgrows monotonically as we move down a thread inpo- order. That is, for any(e1, e2)∈...

work page