pith. sign in

arxiv: 1907.08337 · v1 · pith:ZMYA2AEPnew · submitted 2019-07-19 · 💻 cs.SE · cs.PL

Online Set-Based Dynamic Analysis for Sound Predictive Race Detection

Pith reviewed 2026-05-24 19:28 UTC · model grok-4.3

classification 💻 cs.SE cs.PL
keywords predictive data race detectioncausally-precedes relationdynamic analysisonline analysissoundnesscompletenessconcurrency bugs
0
0 comments X

The pith

Raptor computes the causally-precedes relation soundly and completely over entire execution traces as an online analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Raptor, a dynamic analysis for predictive data race detection based on the causally-precedes relation. Earlier work could apply this relation only to bounded segments of traces because full traces caused prohibitive computation. Raptor employs a set-based method that processes the trace incrementally while preserving both soundness and completeness. As a result it identifies every CP-race present in the observed execution and scales to program runs that previous techniques could not handle.

Core claim

Raptor computes CP soundly and completely as an inherently online analysis that analyzes and finds all CP-races of an execution trace in its entirety, scaling to program executions that the prior CP analysis cannot handle.

What carries the argument

A set-based dynamic analysis that tracks ordering information across the full unbounded trace.

If this is right

  • Raptor scales to program executions that the prior CP analysis cannot handle.
  • It finds data races that the prior CP analysis cannot find.
  • It analyzes the execution trace in its entirety without requiring bounded windows.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The online property may support continuous monitoring of long-running services by updating race information as new events arrive.
  • The approach could be adapted to other predictive concurrency properties that previously relied on trace windows.

Load-bearing premise

The set-based dynamic analysis can maintain soundness and completeness while processing full unbounded traces without the computational blowup that forced prior work to use bounded windows.

What would settle it

An execution trace on which Raptor either reports a race that cannot occur in any alternate scheduling or misses a race that the causally-precedes relation identifies.

Figures

Figures reproduced from arXiv: 1907.08337 by Jake Roemer, Michael D. Bond.

Figure 1
Figure 1. Figure 1: Two observed program executions (of potentially diffe [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: An example execution in which wr(x)1 ≺CP rd(x)1 T3. The last column shows, for each event e, orderings relevant to wr(x)1 ≺CP rd(x)1 T3 for the subtrace up to and including e. The arrow represents a CP ordering established by Rule (a) of the CP definition. Smaragdakis et al.’s CP algorithm encodes the recursive definition of CP in Datalog, guaranteeing polynomial-time execution in the size of the execution… view at source ↗
Figure 3
Figure 3. Figure 3: The invariants maintained by the Raptor analysis at e [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Raptor’s analysis state updates for the execution fro [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The execution from Figure [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: An execution illustrating CCP transfer in which wr(x) [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: An execution in which wr(x) 1 ≺CP wr(x) 2 that shows more complex CCP transfer than [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗
read the original abstract

Predictive data race detectors find data races that exist in executions other than the observed execution. Smaragdakis et al. introduced the causally-precedes (CP) relation and a polynomial-time analysis for sound (no false races) predictive data race detection. However, their analysis cannot scale beyond analyzing bounded windows of execution traces. This work introduces a novel dynamic analysis called Raptor that computes CP soundly and completely. Raptor is inherently an online analysis that analyzes and finds all CP-races of an execution trace in its entirety. An evaluation of a prototype implementation of Raptor shows that it scales to program executions that the prior CP analysis cannot handle, finding data races that the prior CP analysis cannot find.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces Raptor, an online set-based dynamic analysis that computes the causally-precedes (CP) relation soundly and completely for predictive data race detection. Unlike the prior bounded-window CP analysis of Smaragdakis et al., Raptor processes entire unbounded execution traces and reports all CP-races, with a prototype evaluation claiming improved scalability.

Significance. If the soundness, completeness, and scalability claims hold, the work would meaningfully extend predictive race detection to full program executions that prior CP methods cannot handle, potentially increasing the number of detectable races in large traces.

major comments (1)
  1. Abstract: the central claims that Raptor 'computes CP soundly and completely' as an 'inherently online analysis' and 'scales to program executions that the prior CP analysis cannot handle' are asserted without any derivation, algorithm description, proof sketch, or evaluation data, making verification of the load-bearing contribution impossible from the supplied text.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the review and the opportunity to clarify our contributions. We address the single major comment below.

read point-by-point responses
  1. Referee: [—] Abstract: the central claims that Raptor 'computes CP soundly and completely' as an 'inherently online analysis' and 'scales to program executions that the prior CP analysis cannot handle' are asserted without any derivation, algorithm description, proof sketch, or evaluation data, making verification of the load-bearing contribution impossible from the supplied text.

    Authors: The abstract is a concise summary of the paper's contributions. The full manuscript contains the requested details: Section 3 presents the Raptor algorithm and its online set-based computation of the CP relation; Section 4 provides formal proofs of soundness and completeness; and Section 5 reports the prototype evaluation demonstrating scalability to full (unbounded) execution traces where the prior bounded-window CP analysis of Smaragdakis et al. does not apply. These sections directly support the abstract claims. If the supplied text consisted only of the abstract, we can revise the abstract to include explicit forward references to these sections. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The provided abstract and context describe Raptor as an algorithmic extension of the externally cited CP relation from Smaragdakis et al., with the core claim being an online set-based implementation that handles unbounded traces. No equations, fitted parameters, self-citations, or ansatzes are present that reduce the claimed soundness/completeness result to a redefinition or input by construction. The derivation chain is therefore self-contained as a novel dynamic analysis technique rather than a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities used by the analysis.

pith-pipeline@v0.9.0 · 5638 in / 1018 out tokens · 15684 ms · 2026-05-24T19:28:53.437665+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

68 extracted references · 68 canonical work pages

  1. [1]

    S. V. Adve and H.-J. Boehm. Memory Models: A Case for Rethinking Parallel Languages and Hardware. CACM, 53:90–101, 2010

  2. [2]

    S. V. Adve, M. D. Hill, B. P. Miller, and R. H. B. Netzer. Detect ing Data Races on Weak Memory Systems. In ISCA, pages 234–243, 1991

  3. [3]

    Bessey, K

    A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem , C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler. A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Re al World. CACM, 53(2):66–75, 2010

  4. [4]

    Biswas, M

    S. Biswas, M. Cao, M. Zhang, M. D. Bond, and B. P. Wood. Lightweigh t Data Race Detection for Production Runs. In CC, pages 11–21, 2017

  5. [5]

    Biswas, M

    S. Biswas, M. Zhang, M. D. Bond, and B. Lucia. Valor: Efficient, Soft ware-Only Region Conflict Exceptions. In OOPSLA, pages 241–259, 2015

  6. [6]

    S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Framp- ton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Mos s, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo Benchmarks: Java Benc hmarking Development and Analysis. In OOPSLA, pages 169–190, 2006

  7. [7]

    H.-J. Boehm. How to miscompile programs with “benign” data race s. In HotPar, 2011

  8. [8]

    H.-J. Boehm. Position paper: Nondeterminism is Unavoidable, but D ata Races are Pure Evil. In RACES, pages 9–14, 2012

  9. [9]

    Boehm and S

    H.-J. Boehm and S. V. Adve. Foundations of the C++ Concurrency Me mory Model. In PLDI, pages 68–78, 2008

  10. [10]

    Boehm and S

    H.-J. Boehm and S. V. Adve. You Don’t Know Jack about Shared V ariables or Memory Models. CACM, 55(2):48–54, 2012

  11. [11]

    Boehm and B

    H.-J. Boehm and B. Demsky. Outlawing Ghosts: Avoiding Out-of- Thin-Air Results. In MSPC, pages 7:1–7:6, 2014

  12. [12]

    M. D. Bond, K. E. Coons, and K. S. McKinley. Pacer: Proportional D etection of Data Races. In PLDI, pages 255–268, 2010

  13. [13]

    Burckhardt, P

    S. Burckhardt, P. Kothari, M. Musuvathi, and S. Nagarakatt e. A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs. In ASPLOS, pages 167–178, 2010

  14. [14]

    Burnim, K

    J. Burnim, K. Sen, and C. Stergiou. Testing Concurrent Programs on Relaxed Memory Models. In ISSTA, pages 122–132, 2011

  15. [15]

    Cai and L

    Y. Cai and L. Cao. Effective and Precise Dynamic Detection of Hidde n Races for Java Programs. In ESEC/FSE, pages 450–461, 2015

  16. [16]

    M. Cao, J. Roemer, A. Sengupta, and M. D. Bond. Prescient Memor y: Exposing Weak Memory Model Behavior by Looking into the Future. In ISMM, pages 99–110, 2016

  17. [17]

    F. Chen, T. F. Şerbănuţă, and G. Roşu. jPredictor: A Predictive Runtime Analysis Tool for Java. In ICSE, pages 221–230, 2008

  18. [18]

    J.-D. Choi, K. Lee, A. Loginov, R. O’Callahan, V. Sarkar, and M . Sridharan. Efficient and Precise Datarace Detection for Multithreaded Object-Oriented Programs. In PLDI, pages 258–269, 2002

  19. [19]

    Dinning and E

    A. Dinning and E. Schonberg. Detecting Access Anomalies in Programs w ith Critical Sections. In PADD, pages 85–96, 1991

  20. [20]

    Effinger-Dean, B

    L. Effinger-Dean, B. Lucia, L. Ceze, D. Grossman, and H.-J. Boeh m. IFRit: Interference-Free Regions for Dynamic Data-Race Detection. In OOPSLA, pages 467–484, 2012

  21. [21]

    Elmas, S

    T. Elmas, S. Qadeer, and S. Tasiran. Precise Race Detection and Efficient Model Checking Using Locksets. Technical report, Microsoft Research, 2005

  22. [22]

    Elmas, S

    T. Elmas, S. Qadeer, and S. Tasiran. Goldilocks: A Race and Tra nsaction-Aware Java Runtime. In PLDI, pages 245–255, 2007. Online Set-Based Dynamic Analysis for Sound Predictive Race D etection 1:33

  23. [23]

    Engler and K

    D. Engler and K. Ashcraft. RacerX: Effective, Static Detectio n of Race Conditions and Deadlocks. In SOSP, pages 237–252, 2003

  24. [24]

    Erickson, M

    J. Erickson, M. Musuvathi, S. Burckhardt, and K. Olynyk. Effec tive Data-Race Detection for the Kernel. In OSDI, pages 1–16, 2010

  25. [25]

    Eslamimehr and J

    M. Eslamimehr and J. Palsberg. Race Directed Scheduling of Co ncurrent Programs. In PPoPP, pages 301–314, 2014

  26. [26]

    Flanagan and S

    C. Flanagan and S. N. Freund. FastTrack: Efficient and Precise Dyna mic Race Detection. In PLDI, pages 121–133, 2009

  27. [27]

    Flanagan and S

    C. Flanagan and S. N. Freund. Adversarial Memory for Detecting Destructive Races. In PLDI, pages 244–254, 2010

  28. [28]

    Flanagan and S

    C. Flanagan and S. N. Freund. The RoadRunner Dynamic Analysis Fram ework for Concurrent Programs. In PASTE, pages 1–8, 2010

  29. [29]

    Godefroid and N

    P. Godefroid and N. Nagappan. Concurrency at Microsoft – An Exp loratory Survey. In EC2, 2008

  30. [30]

    B. Goetz. Plugging memory leaks with weak references, 2005. http://www-128.ibm.com/developerworks/java/library/ j-jtp11225/

  31. [31]

    J. Huang. Stateless Model Checking Concurrent Programs with M aximal Causality Reduction. In PLDI, pages 165–174, 2015

  32. [32]

    Huang, P

    J. Huang, P. O. Meredith, and G. Roşu. Maximal Sound Predictive R ace Detection with Control Flow Abstraction. In PLDI, pages 337–348, 2014

  33. [33]

    Huang and A

    J. Huang and A. K. Rajagopalan. Precise and Maximal Race Detect ion from Incomplete Traces. In OOPSLA, pages 462–476, 2016

  34. [34]

    Kasikci, C

    B. Kasikci, C. Zamfir, and G. Candea. Data Races vs. Data Race B ugs: Telling the Difference with Portend. In ASPLOS, pages 185–198, 2012

  35. [35]

    Kasikci, C

    B. Kasikci, C. Zamfir, and G. Candea. RaceMob: Crowdsourced D ata Race Detection. In SOSP, pages 406–422, 2013

  36. [36]

    Kasikci, C

    B. Kasikci, C. Zamfir, and G. Candea. Automated Classification o f Data Races Under Both Strong and Weak Memory Models. TOPLAS, 37(3):8:1–8:44, May 2015

  37. [37]

    D. Kini, U. Mathur, and M. Viswanathan. Dynamic Race Prediction in Linea r Time. In PLDI, pages 157–170, 2017

  38. [38]

    L. Lamport. Time, Clocks, and the Ordering of Events in a Distribu ted System. CACM, 21(7):558–565, 1978

  39. [39]

    L. Lamport. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs. IEEE Computer, 28:690–691, 1979

  40. [40]

    N. G. Leveson and C. S. Turner. An Investigation of the Therac- 25 Accidents. IEEE Computer, 26(7):18–41, 1993

  41. [41]

    Lindholm and F

    T. Lindholm and F. Yellin. The Java Virtual Machine Specification . Prentice Hall PTR, 2nd edition, 1999

  42. [42]

    P. Liu, O. Tripp, and X. Zhang. IPA: Improving Predictive Analysis w ith Pointer Analysis. In ISSTA, pages 59–69, 2016

  43. [43]

    S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from Mistakes: A Co mprehensive Study on Real World Concurrency Bug Characteristics. In ASPLOS, pages 329–339, 2008

  44. [44]

    Manson, W

    J. Manson, W. Pugh, and S. V. Adve. The Java Memory Model. In POPL, pages 378–391, 2005

  45. [45]

    Marino, M

    D. Marino, M. Musuvathi, and S. Narayanasamy. LiteRace: Effect ive Sampling for Lightweight Data-Race Detection. In PLDI, pages 134–143, 2009

  46. [46]

    F. Mattern. Virtual Time and Global States of Distributed Syst ems. In Workshop on Parallel and Distributed Algorithms, pages 215–226, 1988

  47. [47]

    Musuvathi and S

    M. Musuvathi and S. Qadeer. Iterative Context Bounding for Sys tematic Testing of Multithreaded Programs. In PLDI, pages 446–455, 2007

  48. [48]

    Naik and A

    M. Naik and A. Aiken. Conditional Must Not Aliasing for Static Race Det ection. In POPL, pages 327–338, 2007

  49. [49]

    M. Naik, A. Aiken, and J. Whaley. Effective Static Race Detection for Java. In PLDI, pages 308–319, 2006

  50. [50]

    Narayanasamy, Z

    S. Narayanasamy, Z. Wang, J. Tigani, A. Edwards, and B. Calder. Automatically Classifying Benign and Harmful Data Races Using Replay Analysis. In PLDI, pages 22–31, 2007

  51. [51]

    Nishiyama

    H. Nishiyama. Detecting Data Races using Dynamic Escape Analysis b ased on Read Barrier. In VMRT, pages 127–138, 2004

  52. [52]

    O’Callahan and J.-D

    R. O’Callahan and J.-D. Choi. Hybrid Dynamic Data Race Detect ion. In PPoPP, pages 167–178, 2003

  53. [53]

    Nasdaq’s Facebook Glitch Came From Race Conditions , 2012

    PCWorld. Nasdaq’s Facebook Glitch Came From Race Conditions , 2012. http://www.pcworld.com/article/255911/nasdaqs_facebook_glitch_came_from_race_conditions.html

  54. [54]

    Pozniansky and A

    E. Pozniansky and A. Schuster. MultiRace: Efficient On-the-Fly Da ta Race Detection in Multithreaded C++ Programs. CCPE, 19(3):327–340, 2007

  55. [55]

    Pratikakis, J

    P. Pratikakis, J. S. Foster, and M. Hicks. LOCKSMITH: Context -Sensitive Correlation Analysis for Race Detection. In PLDI, pages 320–331, 2006

  56. [56]

    Roemer and M

    J. Roemer and M. D. Bond. An Online Dynamic Analysis for Sound Predict ive Data Race Detection. Technical Report OSU-CISRC-4/16-TR01, Computer Science & Engineering, Ohio State Univ ersity, 2016

  57. [57]

    Roemer, K

    J. Roemer, K. Genç, and M. D. Bond. High-Coverage, Unbounded So und Predictive Race Detection. In PLDI, pages 374–389, 2018

  58. [58]

    M. Said, C. Wang, Z. Yang, and K. Sakallah. Generating Data Race W itnesses by an SMT-based Analysis. In NFM, pages 313–327, 2011. 1:34 Jake Roemer and Michael D. Bond

  59. [59]

    Savage, M

    S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Ande rson. Eraser: A Dynamic Data Race Detector for Multi- Threaded Programs. In SOSP, pages 27–37, 1997

  60. [60]

    K. Sen. Race Directed Random Testing of Concurrent Programs. In PLDI, pages 11–21, 2008

  61. [61]

    T. F. Şerbănuţă, F. Chen, and G. Roşu. Maximal Causal Models f or Sequentially Consistent Systems. In RV, pages 136–150, 2013

  62. [62]

    Ševčík and D

    J. Ševčík and D. Aspinall. On Validity of Program Transformations in the Java Memory Model. In ECOOP, pages 27–51, 2008

  63. [63]

    Smaragdakis, J

    Y. Smaragdakis, J. Evans, C. Sadowski, J. Yi, and C. Flanagan. So und Predictive Race Detection in Polynomial Time. In POPL, pages 387–400, 2012

  64. [64]

    Final Report o n the August 14th Blackout in the United States and Canada

    U.S.–Canada Power System Outage Task Force. Final Report o n the August 14th Blackout in the United States and Canada. Technical report, Department of Energy, 2004

  65. [65]

    Veeraraghavan, P

    K. Veeraraghavan, P. M. Chen, J. Flinn, and S. Narayanasamy. Det ecting and Surviving Data Races using Complemen- tary Schedules. In SOSP, pages 369–384, 2011

  66. [66]

    von Praun and T

    C. von Praun and T. R. Gross. Object Race Detection. In OOPSLA, pages 70–82, 2001

  67. [67]

    J. W. Voung, R. Jhala, and S. Lerner. RELAY: Static Race Detec tion on Millions of Lines of Code. In ESEC/FSE, pages 205–214, 2007

  68. [68]

    Y. Yu, T. Rodeheffer, and W. Chen. RaceTrack: Efficient Detectio n of Data Race Conditions via Adaptive Tracking. In SOSP, pages 221–234, 2005