A fine-grained look at causal effects in causal spaces
Pith reviewed 2026-05-16 22:51 UTC · model grok-4.3
The pith
Causal effects can be defined and quantified directly on events rather than variables inside the causal spaces framework.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Within the measure-theoretic framework of causal spaces, several binary definitions are introduced to determine the presence of causal effects on events, together with properties that link those definitions to (in)dependence under an intervention measure. Quantifying measures are then supplied that capture the strength and nature of causal effects on events, and common measures of treatment effect are recovered as special cases of these new quantities.
What carries the argument
Binary definitions of causal-effect presence on events, together with associated quantifying measures that connect effect strength to independence under intervention measures.
If this is right
- Binary tests decide whether a causal effect is present on a given event.
- The properties proved in the paper link these tests to (in)dependence statements under the intervention measure.
- Numerical measures quantify both the strength and the directional nature of the effect.
- Standard average treatment effect and related quantities appear as special cases when events are chosen to match the usual variable formulation.
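A minimal sketch of the last two points on a finite outcome space. It treats each intervention as a plain probability table and uses a simple difference of event probabilities as the contrast. The table `P_DO`, the helper names, and the numbers are all hypothetical, and the contrast is an illustrative stand-in rather than the paper's own definition. For a binary outcome, the contrast on the event {Y = 1} coincides with the average treatment effect.

```python
from fractions import Fraction

# Hypothetical interventional distributions of a binary outcome Y,
# one table per treatment level w. The numbers are made up.
P_DO = {
    0: {0: Fraction(7, 10), 1: Fraction(3, 10)},
    1: {0: Fraction(2, 5), 1: Fraction(3, 5)},
}

def event_probability(dist, event):
    """Probability of an event (a set of outcome values) under dist."""
    return sum(p for y, p in dist.items() if y in event)

def event_contrast(event, w1=1, w0=0):
    """Difference in the event's probability between two interventions."""
    return event_probability(P_DO[w1], event) - event_probability(P_DO[w0], event)

def has_effect_on_event(event):
    """Toy binary test: does the intervention level change P(event) at all?"""
    probs = {event_probability(dist, event) for dist in P_DO.values()}
    return len(probs) > 1

def expectation(dist):
    """Mean of the outcome under dist."""
    return sum(y * p for y, p in dist.items())

# Variable-level quantity: E[Y | do(1)] - E[Y | do(0)].
ate = expectation(P_DO[1]) - expectation(P_DO[0])

# Event-level quantity on the event {Y = 1}.
contrast = event_contrast({1})

print(ate, contrast)                 # both are the same fraction
print(has_effect_on_event({1}))      # the event {Y = 1} is affected
print(has_effect_on_event({0, 1}))   # the sure event is never affected
```

The sure event {0, 1} has probability one under every intervention, so the toy test reports no effect on it regardless of the tables chosen.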
Where Pith is reading between the lines
- The same machinery could be used to pose causal questions about specific pixel patterns or token sequences without first inventing intermediate variables.
- Event-level definitions might allow causal analysis inside trained models where only the raw input tokens or activations are observable.
- The link to intervention-based independence could be used to import existing tools from measure-theoretic probability directly into causal estimation pipelines.
Load-bearing premise
The causal-spaces axiomatization gives a rich enough setting for event-level causal questions and the intervention measures used in the proofs match the interventions that matter in the target domains.
What would settle it
A concrete example or data set in which the new event-level presence test or strength measure yields a different conclusion from the standard variable-level treatment effect under the same intervention measure.
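The kind of comparison described above can be sketched with a toy case in which the mean outcome is unchanged by treatment while the probability of a particular event is not. The distributions below are hypothetical, and the contrasts are plain differences of means and of event probabilities, not the paper's definitions. Whether the paper's own tests behave this way would need to be checked against its formal statements.

```python
from fractions import Fraction

# Hypothetical interventional distributions of an outcome Y in {-1, 0, 1}.
# Under do(0) the outcome is always 0; under do(1) it is -1 or 1 with equal odds.
P_DO = {
    0: {0: Fraction(1)},
    1: {-1: Fraction(1, 2), 1: Fraction(1, 2)},
}

def mean(dist):
    """Mean of the outcome under dist."""
    return sum(y * p for y, p in dist.items())

def prob(dist, event):
    """Probability of an event (a set of outcome values) under dist."""
    return sum(p for y, p in dist.items() if y in event)

# Variable-level contrast: difference in means between the two interventions.
ate = mean(P_DO[1]) - mean(P_DO[0])

# Event-level contrast on the event {Y = 1}.
event = {1}
event_shift = prob(P_DO[1], event) - prob(P_DO[0], event)

print("difference in means:", ate)
print("shift in P(Y = 1):", event_shift)
```

Here the difference in means is zero because the two branches under do(1) cancel, yet the event {Y = 1} goes from impossible to probability one half.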
Original abstract
The notion of causal effect is fundamental across many scientific disciplines. Traditionally, quantitative researchers have studied causal effects at the level of variables; for example, how a certain drug dose (W) causally affects a patient's blood pressure (Y). However, in many modern data domains, the raw variables, such as pixels in an image or tokens in a language model, do not have the semantic structure needed to formulate meaningful causal questions. In this paper, we offer a more fine-grained perspective by studying causal effects at the level of events, drawing inspiration from probability theory, where core notions such as independence are first given for events and sigma-algebras, before random variables enter the picture. Within the measure-theoretic framework of causal spaces, a recently introduced axiomatisation of causality, we first introduce several binary definitions that determine whether a causal effect is present, as well as proving some properties of them linking causal effect to (in)dependence under an intervention measure. Further, we provide quantifying measures that capture the strength and nature of causal effects on events, and show that we can recover the common measures of treatment effect as special cases.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces binary definitions for the presence or absence of causal effects at the level of events (rather than variables) inside the recently proposed measure-theoretic causal-spaces framework. It proves properties that link these definitions to (in)dependence under an intervention measure, develops quantitative measures of effect strength and nature, and shows that standard treatment-effect functionals (ATE, ATT, etc.) arise as special cases when events are chosen appropriately.
Significance. If the derivations hold, the work supplies a coherent event-level extension of causality that is continuous with classical variable-level measures. The explicit recovery of ATE/ATT as special cases and the measure-theoretic grounding are strengths that could support finer-grained causal questions in domains where variables lack semantic structure (images, text).
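For reference, the two functionals named in the summary have the following standard potential-outcomes form for a binary treatment. The notation is assumed here for illustration (W is the treatment indicator, and Y_1, Y_0 are the potential outcomes under treatment and control) and is not taken from the paper's main text.

```latex
\mathrm{ATE} = \mathbb{E}[Y_1 - Y_0],
\qquad
\mathrm{ATT} = \mathbb{E}[Y_1 - Y_0 \mid W = 1].
```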
minor comments (2)
- The abstract states that properties are proved and that standard measures are recovered, but the main text should include explicit statements of the intervention-measure construction (e.g., the precise sigma-algebra and measure used for each binary definition) so that readers can verify the independence claims without external reference.
- Notation for events and intervention measures should be introduced once in a dedicated preliminary section and then used consistently; several ad-hoc symbols appear in the quantifying-measure definitions that are not cross-referenced to the binary definitions.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our manuscript and the recommendation for minor revision. We are pleased that the event-level definitions, their links to intervention-based independence, the quantitative measures, and the recovery of ATE/ATT as special cases are viewed as coherent extensions within the causal-spaces framework. No specific major comments were raised in the report.
Circularity Check
No significant circularity detected
The paper takes the recently introduced causal-spaces axiomatization as an external starting point and introduces fresh binary definitions for the presence of event-level causal effects, proves their links to (in)dependence under intervention measures, defines quantitative strength measures, and shows that classical treatment-effect functionals are recovered as special cases. None of these steps reduces by construction to a fitted parameter, a self-citation chain, or a renaming of prior results; the derivation chain is therefore self-contained against the supplied axiomatic input.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: causal spaces provide a measure-theoretic axiomatization of causality sufficient for defining events and intervention measures.