pith. machine review for the scientific record. sign in

arxiv: 2604.09589 · v1 · submitted 2026-03-02 · 💻 cs.CC · cs.LO· cs.PL

Recognition: no theorem link

Complexity of Consistency Testing for the Release-Acquire Semantics

Authors on Pith no claims yet

Pith reviewed 2026-05-15 17:07 UTC · model grok-4.3

classification 💻 cs.CC cs.LOcs.PL
keywords consistency testingrelease-acquire semanticsC11 memory modelweak memory modelsNP-hardnesspolynomial-time algorithmExponential Time Hypothesis
0
0 comments X

The pith

Consistency testing for release-acquire memory is polynomial-time when each location has at most one writer.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that deciding whether an observed sequence of reads and writes is consistent under release-acquire semantics can be done in polynomial time when each memory location is written by only one thread. This restriction does not help for sequential consistency, where the problem remains NP-hard. The result is complemented by hardness: the decision problem becomes NP-complete as soon as two threads can write the same location, and with three writers per location it cannot be solved in 2 to the o(k) times polynomial in n time under the Exponential Time Hypothesis.

Core claim

Consistency testing under the release-acquire variant of the C11 memory model lies in P when every memory location is written by at most one thread. The same problem is NP-hard when two threads may write any given location, and under the Exponential Time Hypothesis it admits no 2^{o(k)} n^{O(1)} algorithm once three writers per location are permitted.

What carries the argument

A polynomial-time graph reachability procedure that exploits the single-writer restriction to construct a total order on writes and check release-acquire synchronization edges without enumerating interleavings.

Load-bearing premise

The release-acquire semantics are taken exactly as defined in the C11 standard and prior literature, without extra constraints on thread or location counts.

What would settle it

An execution with one writer per location on which the claimed polynomial algorithm outputs 'consistent' yet an exhaustive enumeration of possible interleavings shows a violation of release-acquire ordering.

Figures

Figures reproduced from arXiv: 2604.09589 by B. Srivathsan, R. Govind, Sanchari Sil, S. Krishna.

Figure 1
Figure 1. Figure 1: Violations of coherence and causality. Values in the [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: From left: rf0 ⊑ rf1 ⊑ rf2. rf0 has a violation of weak-read-coherence: r3 rf−1 0 −−−→ w1 po −→ w2 hb0 −−→ r3. rf1 has a violation of weak-read-coherence: r4 rf−1 1 −−−→ w2 po −→ w3 hb1 −−→ r4. rf2 does not have any more violations of weak-read-coherence. porf-acyclicity is also satisfied by rf2. The underlying partial execution graph is WRA-consistent. 3.2 Algorithm Our algorithm starts from an rf0 which … view at source ↗
Figure 3
Figure 3. Figure 3: From left: rf0 ⊑ rf1. rf0 has a violation of weak-read￾coherence: r3 rf−1 0 −−−→ w1 po −→ w2 hb −→ r3. rf1 has no weak-read-coherence violation but does not satisfy porf￾acyclicity: w5 rf1 −−→ r3 po −→ w4 rf −→ r2 po −→ w5. The partial execution graph is not WRA-consistent. From rf i to rf i+1. Pick a violating cycle for weak-read-coherence: ri rf−1 i −−→ wi hbi −−→ w ′ i hbi −−→ ri where all three events … view at source ↗
Figure 4
Figure 4. Figure 4: From triangles to violations of weak￾read-coherence Our idea for the reduction is as follows. We will construct X in such a way that for every r(x, v) there is a unique w(x, v) present in the entire program. There￾fore, the rf relation is fixed. If G has a triangle, we will have a shared variable a and a cycle of the form: w(a, 0) po −→ w(a, 1) hb −→ r(a, 0) rf−1 −−→ w(a, 0). This cycle vi￾olates weak-read… view at source ↗
Figure 5
Figure 5. Figure 5: Threads in the reduction, for i ∈ {1, 2, . . . , k} Thus, if there was a 2 o(k) · n O(1) algo￾rithm for the VM problem, we could solve 3-CNF-SAT in 2 o(k) .(k + m) O(1) time: from φ, construct the VM in￾stance in O(k + m) time, and then use the 2 o(k) · (k + m) O(1) procedure for VM, thereby contradicting ETH. To show the last item above, we make use of what is known in the literature as a range reduction … view at source ↗
Figure 6
Figure 6. Figure 6: A partial execution X for a graph G = ({1, 2, 3}, {(1, 2),(2, 1),(1, 3),(3, 1),(2, 3),(3, 2)}). We have (w(a1, 0), r(a1, 0)) ∈ hb violating weak read coherence in X. shows the reduction on a graph with 3 vertices {1, 2, 3}, having edges between every distinct pair of vertices. Correctness and Time Complexity. Assume that G has a triangle (x, y, z). Wlg, let x < y < z. Then w(ax, 1) hb r(ax, 0) as follows: … view at source ↗
Figure 7
Figure 7. Figure 7: Violation of (a) relaxed-write-coherence: [PITH_FULL_IMAGE:figures/full_fig_p027_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Threads in the reduction for Relaxed The threads Init1 and Init0 initialize the valuation used to evaluate the for￾mula. For each variable xi correspond￾ing to the Boolean variables of φ, Init1 writes 1 and Init0 writes 0. Txi has a read statement r(vi , 0) followed by r(vi , 1) (which denotes reading true for xi) and sets all clauses containing xi to true, using the statement w(cj , 1) for all j ∈ Sxi . S… view at source ↗
read the original abstract

In a seminal work, Gibbons and Korach studied the complexity of deciding whether an observed sequence of reads and writes of a multi-threaded program admits a sequentially consistent interleaving. They showed the problem to be NP-hard even under strong syntactic restrictions. More recently, Chakraborty et al. considered the problem for weak memory models and proved that NP-hardness remains even when the number of threads, the number of memory locations, and the value domain are all bounded. In this paper we revisit the problem for the release-acquire variants of the C11 memory model. Our main positive result is that consistency testing can be done in polynomial-time when each memory location is written by at most one thread (multiple readers are allowed). Notably, this restriction is already NP-hard for sequential consistency. We complement this upper bound with tight hardness results: the problem is NP-hard when two threads may write to the same location, and allowing three writers per location rules out 2^{o(k)}.n^{O(1)} algorithms under the Exponential Time Hypothesis, where k denotes the number of threads, and n the number of memory operations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper studies the complexity of deciding whether an observed execution trace is consistent with the release-acquire (RA) fragment of the C11 memory model. The central positive result is a polynomial-time algorithm for consistency testing when each memory location has at most one writer (multiple readers permitted). This is contrasted with the known NP-hardness of the same syntactic restriction under sequential consistency. The paper complements the upper bound with matching hardness: NP-hardness already holds when two threads may write the same location, and an ETH-based lower bound rules out 2^{o(k)} n^{O(1)} algorithms when three writers per location are allowed (k = number of threads, n = number of operations).

Significance. If the results hold, the work supplies a clean complexity map for RA consistency checking that is absent from prior literature. The polynomial-time algorithm under the single-writer restriction is a concrete algorithmic contribution that exploits the RA semantics in a way that does not apply to sequential consistency; the hardness results are tight and obtained via standard parameterized-complexity reductions. The paper therefore both improves the state of the art for weak-memory verification and clarifies why the choice of memory model qualitatively changes the complexity of the consistency problem.

minor comments (3)
  1. [§4] The high-level description of the polynomial-time algorithm (presumably in §4) would be easier to follow if a short pseudocode block or a diagram of the dependency-graph construction were added.
  2. [Theorem 5.3] In the ETH lower-bound statement, the parameter k is defined as the number of threads; it would help to restate this explicitly in the theorem statement rather than only in the surrounding text.
  3. [§2] A few citations to the original C11 RA formalization (e.g., the precise happens-before and release-acquire axioms used) appear only in the introduction; repeating the key definitions in a dedicated preliminary section would improve self-containedness.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. The summary accurately captures the main contributions, and we are glad that the distinction between the release-acquire model and sequential consistency is viewed as a useful clarification for the complexity of consistency checking.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper establishes a polynomial-time algorithm for RA consistency testing under the single-writer-per-location restriction via direct algorithmic construction, and complements it with standard NP-hardness reductions (from known problems) and ETH lower bounds. These steps rely on independent complexity-theoretic techniques and the external C11 RA model definition; no self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear. The contrast with SC hardness is external and does not reduce the central claims to their own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The results rest on the standard definition of release-acquire consistency from the C11 model and basic facts from complexity theory such as NP-completeness and the Exponential Time Hypothesis; no free parameters or new entities are introduced.

axioms (2)
  • domain assumption Release-acquire semantics defined exactly as in the C11 standard and prior literature on weak memory models
    The polynomial algorithm and all reductions are with respect to this fixed model definition.
  • standard math Standard NP-completeness and ETH assumptions from complexity theory
    Used to establish the hardness results for two and three writers.

pith-pipeline@v0.9.0 · 5507 in / 1286 out tokens · 66817 ms · 2026-05-15T17:07:35.639473+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

  1. [1]

    In: CAV (1)

    Agarwal, P., Chatterjee, K., Pathak, S., Pavlogiannis, A., Toman, V.: Stateless model checking under a reads-value-from equivalence. In: CAV (1). Lecture Notes in Computer Science, vol. 12759, pp. 341–366. Springer (2021)

  2. [2]

    Ahamad, M., Neiger, G., Burns, J.E., Kohli, P., Hutto, P.W.: Causal mem- ory: Definitions, implementation, and programming. Distrib. Comput.9(1), 37–49 (mar 1995).https://doi.org/10.1007/BF01784241,https://doi.org/10.1007/ BF01784241

  3. [3]

    In: POPL’11

    Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concurrency. In: POPL’11. pp. 55–66. ACM (2011).https://doi.org/10.1145/ 1926385.1926394

  4. [4]

    In: POPL

    Bouajjani, A., Enea, C., Guerraoui, R., Hamza, J.: On verifying causal consistency. In: POPL. pp. 626–638. ACM (2017)

  5. [5]

    Proceedings of the ACM on Programming Languages5(OOPSLA), 1–30 (2021)

    Bui, T.L., Chatterjee, K., Gautam, T., Pavlogiannis, A., Toman, V.: The reads- from equivalence for the tso and pso memory models. Proceedings of the ACM on Programming Languages5(OOPSLA), 1–30 (2021)

  6. [6]

    Foundations and Trends in Programming Languages1(1-2), 1–150 (2014).https://doi.org/10.1561/ 2500000011

    Burckhardt, S.: Principles of Eventual Consistency. Foundations and Trends in Programming Languages1(1-2), 1–150 (2014).https://doi.org/10.1561/ 2500000011

  7. [7]

    ACM Program

    Chakraborty, S., Krishna, S.N., Mathur, U., Pavlogiannis, A.: How hard is weak- memory testing? Proc. ACM Program. Lang.8(POPL), 1978–2009 (2024)

  8. [8]

    In: HPCA

    Chen, Y., Lv, Y., Hu, W., Chen, T., Shen, H., Wang, P., Pan, H.: Fast complete memory consistency verification. In: HPCA. pp. 381–392. IEEE Computer Society (2009)

  9. [9]

    In: Saxena, N., Simon, S

    Chini, P., Saivasan, P.: A framework for consistency algorithms. In: Saxena, N., Simon, S. (eds.) 40th IARCS Annual Conference on Foundations of Soft- ware Technology and Theoretical Computer Science, FSTTCS 2020, December 14-18, 2020, BITS Pilani, K K Birla Goa Campus, Goa, India (Virtual Confer- ence). LIPIcs, vol. 182, pp. 42:1–42:17. Schloss Dagstuhl ...

  10. [10]

    Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput.26(4), 1208–1244 (1997)

  11. [11]

    Impagliazzo, R., Paturi, R., Zane, F.: Which problems have strongly exponential complexity? J. Comput. Syst. Sci.63(4), 512–530 (2001) Complexity of Consistency Testing for the RA Semantics 17

  12. [12]

    Addison–Wesley (1992)

    JáJá, J.: An Introduction to Parallel Algorithms. Addison–Wesley (1992)

  13. [13]

    Kokologiannakis, M., Lahav, O., Sagonas, K., Vafeiadis, V.: Effective stateless model checking for c/c++ concurrency. Proc. ACM Program. Lang.2(POPL) (Dec 2017).https://doi.org/10.1145/3158105,https://doi.org/10.1145/3158105

  14. [14]

    Kokologiannakis, M., Lahav, O., Vafeiadis, V.: Kater: Automating weak mem- ory model metatheory and consistency checking. Proc. ACM Program. Lang. 7(POPL) (Jan 2023).https://doi.org/10.1145/3571212,https://doi.org/10. 1145/3571212

  15. [15]

    Lahav, O., Boker, U.: What’s decidable about causally consistent shared memory? ACM Trans. Program. Lang. Syst.44(2) (apr 2022).https://doi.org/10.1145/ 3505273

  16. [16]

    Lahav, O., Boker, U.: What’s decidable about causally consistent shared memory? ACM Trans. Program. Lang. Syst.44(2), 8:1–8:55 (2022).https://doi.org/10. 1145/3505273,https://doi.org/10.1145/3505273

  17. [17]

    In: PLDI 2017

    Lahav, O., Vafeiadis, V., Kang, J., Hur, C.K., Dreyer, D.: Repairing sequential con- sistency in C/C++11. In: PLDI 2017. pp. 618–632 (2017).https://doi.org/10. 1145/3062341.3062352, technical Appendix Available athttps://plv.mpi-sws. org/scfix/full.pdf

  18. [18]

    Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM21(7), 558–565 (1978)

  19. [19]

    Lokshtanov, D., Marx, D., Saurabh, S.: Lower bounds based on the exponen- tial time hypothesis. Bull. EATCS105, 41–72 (2011),http://eatcs.org/beatcs/ index.php/beatcs/article/view/92

  20. [20]

    Luo, W., Demsky, B.: C11Tester: A Race Detector for C/C++ Atomics, pp. 630–

  21. [21]

    Association for Computing Machinery, New York, NY, USA (2021).https: //doi.org/10.1145/3445814.3446711

  22. [22]

    In: HPCA

    Manovit, C., Hangal, S.: Completely verifying memory consistency of test program executions. In: HPCA. pp. 166–175. IEEE Computer Society (2006)

  23. [23]

    Margalit, R., Lahav, O.: Verifying observational robustness against a c11-style memory model. Proc. ACM Program. Lang.5(POPL) (Jan 2021).https://doi. org/10.1145/3434285,https://doi.org/10.1145/3434285

  24. [24]

    In: LICS

    Mathur, U., Pavlogiannis, A., Viswanathan, M.: The complexity of dynamic data race prediction. In: LICS. pp. 713–727. ACM (2020)

  25. [25]

    In: Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications

    Norris, B., Demsky, B.: Cdschecker: checking concurrent data structures written with c/c++ atomics. In: Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications. p. 131–150. OOPSLA ’13, Association for Computing Machinery, New York, NY, USA(2013).https://doi.org/10.1145/2509136.2509514,ht...

  26. [26]

    IEEE Trans

    Qadeer, S.: Verifying sequential consistency on shared-memory multiprocessors by model checking. IEEE Trans. Parallel Distributed Syst.14(8), 730–741 (2003)

  27. [27]

    Tunç, H.C., Abdulla, P.A., Chakraborty, S., Krishna, S., Mathur, U., Pavlogiannis, A.: Optimal reads-from consistency checking for c11-style memory models. Proc. ACM Program. Lang.7(PLDI) (jun 2023).https://doi.org/10.1145/3591251, https://doi.org/10.1145/3591251

  28. [28]

    happened-before

    Vassilevska Williams, V., Williams, R.R.: Subcubic equivalences between path, matrix, and triangle problems. J. ACM65(5), 27:1–27:38 (2018) 18 R. Govind et al. A Appendix for Section 3 A.1 Proofs of Section 3.1 Lemma 2.LetX=⟨E,po,rf,mo⟩be a1-Writerexecution graph. Then, for M ∈ {SRA,RA}, we haveX|=Miffmoagrees withpoandX|=WRA. Proof.Let us begin withM=RA....

  29. [29]

    For every(e 1, e2)∈hbsuch that(e i, e)∈hbfor eachi∈[2], we have (e1, e2)∈ob e

  30. [30]

    The idea is that any thread which executes the readrmust observewafter it has observedw ′ so that it does not read an obsolete value

    Foreveryconflictingtriplets(w,r,w ′)suchthat:(ii)(w ′,r)∈ob e and(iii)(r, e)∈ po?, we have(w′,w)∈ob e. The idea is that any thread which executes the readrmust observewafter it has observedw ′ so that it does not read an obsolete value. For any evente, obe containshb, andobgrows monotonically as we move down a thread inpo- order. That is, for any(e1, e2)∈...