A fine-grained look at causal effects in causal spaces

Junhyung Park; Yuqing Zhou

arXiv: 2512.11919 · v3 · submitted 2025-12-11 · stat.ME · cs.AI · math.ST · stat.TH


Pith reviewed 2026-05-16 22:51 UTC · model grok-4.3


keywords: causal effects · events · causal spaces · intervention measures · treatment effect · measure theory · event-level causality


The pith

Causal effects can be defined and quantified directly on events rather than variables inside the causal spaces framework.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper moves causal inquiry down to the level of individual events, following the pattern in probability theory where independence is first defined for events before random variables appear. Within causal spaces it supplies binary tests that say whether one event causally affects another, then proves these tests are equivalent to forms of independence under an intervention measure. It next introduces numerical measures of effect strength and shows that familiar treatment-effect quantities arise exactly when the events are chosen in the usual way. The shift matters in settings such as images or text where the raw data points have no natural variable-level semantics for causal questions.

Core claim

Within the measure-theoretic framework of causal spaces, several binary definitions are introduced to determine the presence of causal effects on events, together with properties that link those definitions to (in)dependence under an intervention measure. Quantifying measures are then supplied that capture the strength and nature of causal effects on events, and common measures of treatment effect are recovered as special cases of these new quantities.

What carries the argument

Binary definitions of causal-effect presence on events, together with associated quantifying measures that connect effect strength to independence under intervention measures.
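
As a rough illustration of how a binary presence test can line up with independence under an intervention measure, the following minimal sketch builds a finite toy model. Everything in it is an assumption made for illustration and is not taken from the paper: the binary confounder U, treatment W and outcome Y, the mechanism tables NO_EFFECT and WITH_EFFECT, the distribution q_w, and all helper names. The paper's own definitions live in causal spaces and may differ in detail. Here the intervention measure draws W from a chosen distribution independently of U, and the check asks whether the events {W = 1} and {Y = 1} are independent under that measure.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy model (illustration only, not the paper's construction):
# binary confounder U, treatment W, outcome Y.
P_U = {0: F(1, 2), 1: F(1, 2)}

# Tables of P(Y = 1 | U = u, W = w), keyed by (u, w).
NO_EFFECT = {(0, 0): F(1, 5), (0, 1): F(1, 5), (1, 0): F(4, 5), (1, 1): F(4, 5)}
WITH_EFFECT = {(0, 0): F(1, 5), (0, 1): F(3, 5), (1, 0): F(4, 5), (1, 1): F(4, 5)}


def intervention_measure(mechanism, q_w):
    """Joint law of (u, w, y) when W is drawn from q_w independently of U."""
    measure = {}
    for u, w, y in product((0, 1), repeat=3):
        p_y1 = mechanism[(u, w)]
        measure[(u, w, y)] = P_U[u] * q_w[w] * (p_y1 if y == 1 else 1 - p_y1)
    return measure


def prob(measure, event):
    """Probability of an event given as a predicate on outcomes (u, w, y)."""
    return sum(p for omega, p in measure.items() if event(omega))


def independent(measure, a, b):
    """Exact check of P(A and B) == P(A) * P(B) using rational arithmetic."""
    joint = prob(measure, lambda omega: a(omega) and b(omega))
    return joint == prob(measure, a) * prob(measure, b)


def treated(omega):      # event {W = 1}
    return omega[1] == 1


def positive(omega):     # event {Y = 1}
    return omega[2] == 1


q_w = {0: F(1, 3), 1: F(2, 3)}  # assumed intervention distribution on W
print(independent(intervention_measure(NO_EFFECT, q_w), treated, positive))
print(independent(intervention_measure(WITH_EFFECT, q_w), treated, positive))
```

In this toy model the independence check holds when the mechanism for Y ignores W and fails when it does not. Exact fractions are used so that the equality test is not disturbed by floating-point rounding.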

If this is right

Binary tests decide whether a causal effect is present on a given event.
These tests are equivalent to independence statements under the intervention measure.
Numerical measures quantify both the strength and the directional nature of the effect.
Standard average treatment effect and related quantities appear as special cases when events are chosen to match the usual variable formulation.
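
The last point can be made concrete with a standard identity. The notation below is assumed for illustration and is not the paper's: P^{do(W=w)} denotes the intervention measure that sets a binary treatment W to w, and E^{do(W=w)} the corresponding expectation. For a binary outcome, the average treatment effect is a difference of interventional probabilities of the single event {Y = 1}. For a nonnegative outcome that is integrable under both intervention measures, it is an integral of such differences over the events {Y > t}.

```latex
\begin{align*}
\mathrm{ATE}
  &= E^{\mathrm{do}(W=1)}[Y] - E^{\mathrm{do}(W=0)}[Y], \\
\text{binary } Y:\quad \mathrm{ATE}
  &= P^{\mathrm{do}(W=1)}(\{Y = 1\}) - P^{\mathrm{do}(W=0)}(\{Y = 1\}), \\
Y \ge 0:\quad \mathrm{ATE}
  &= \int_0^\infty \Bigl[ P^{\mathrm{do}(W=1)}(\{Y > t\})
     - P^{\mathrm{do}(W=0)}(\{Y > t\}) \Bigr] \, dt .
\end{align*}
```

How the paper's own quantifying measures specialise to these expressions is not shown in the material above, so this is only a sketch of why an event-level formulation can contain the variable-level one.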

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same machinery could be used to pose causal questions about specific pixel patterns or token sequences without first inventing intermediate variables.
Event-level definitions might allow causal analysis inside trained models where only the raw input tokens or activations are observable.
The link to intervention-based independence could be used to import existing tools from measure-theoretic probability directly into causal estimation pipelines.

Load-bearing premise

The causal-spaces axiomatization gives a rich enough setting for event-level causal questions and the intervention measures used in the proofs match the interventions that matter in the target domains.

What would settle it

A concrete example or data set in which the new event-level presence test or strength measure yields a different conclusion from the standard variable-level treatment effect under the same intervention measure.
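
A hypothetical example of the kind of comparison meant here is sketched below. The interventional outcome distributions in P_DO are invented numbers, and the difference of interventional event probabilities is used as a simple stand-in for an event-level strength measure; it is not claimed to be the measure defined in the paper. The point of the sketch is that the mean-based average treatment effect can vanish while the probability of a particular event still moves under intervention.

```python
from fractions import Fraction as F

# Hypothetical laws of the outcome Y under each intervention (invented numbers).
P_DO = {
    0: {-1: F(1, 10), 0: F(8, 10), 1: F(1, 10)},  # law of Y under do(W = 0)
    1: {-1: F(4, 10), 0: F(2, 10), 1: F(4, 10)},  # law of Y under do(W = 1)
}


def mean(dist):
    """Expectation of Y under a finite distribution {value: probability}."""
    return sum(y * p for y, p in dist.items())


def event_prob(dist, event):
    """Probability of an event given as a predicate on the value of Y."""
    return sum(p for y, p in dist.items() if event(y))


def nonzero(y):  # event {Y != 0}
    return y != 0


# Variable-level contrast: difference of interventional means.
ate = mean(P_DO[1]) - mean(P_DO[0])

# Event-level contrast (assumed stand-in): difference of interventional
# probabilities of the event {Y != 0}.
event_contrast = event_prob(P_DO[1], nonzero) - event_prob(P_DO[0], nonzero)

print("ATE:", ate)
print("contrast on {Y != 0}:", event_contrast)
```

With these numbers the average treatment effect is exactly zero because the treatment spreads mass symmetrically around zero, while the probability of the event {Y != 0} rises by 3/5.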

Original abstract

The notion of causal effect is fundamental across many scientific disciplines. Traditionally, quantitative researchers have studied causal effects at the level of variables; for example, how a certain drug dose (W) causally affects a patient's blood pressure (Y). However, in many modern data domains, the raw variables (such as pixels in an image or tokens in a language model) do not have the semantic structure needed to formulate meaningful causal questions. In this paper, we offer a more fine-grained perspective by studying causal effects at the level of events, drawing inspiration from probability theory, where core notions such as independence are first given for events and sigma-algebras, before random variables enter the picture. Within the measure-theoretic framework of causal spaces, a recently introduced axiomatisation of causality, we first introduce several binary definitions that determine whether a causal effect is present, as well as proving some properties of them linking causal effect to (in)dependence under an intervention measure. Further, we provide quantifying measures that capture the strength and nature of causal effects on events, and show that we can recover the common measures of treatment effect as special cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper defines causal effects at the event level inside causal spaces, with binary presence checks and strength measures that recover ATE and similar quantities as special cases.


The core contribution is a shift to event-level causal effects within the causal spaces framework. The authors introduce binary definitions for whether a causal effect holds between events, prove links to independence under an intervention measure, and supply quantitative strength measures. Standard treatment-effect functionals appear as special cases when events are chosen appropriately. This follows the probability-theory pattern of starting with events before variables, which fits the motivation: in high-dimensional data, pixels or tokens lack the semantic structure needed for variable-level questions.

The work builds directly on the recent causal spaces axiomatization without circular definitions or data-dependent fitting, which keeps the claims clean. The recovery of classical measures is a useful bridge to existing results.

The main limitation is that the abstract states that the properties are proved and the intervention measures are constructed appropriately, but the actual derivations and explicit constructions are not visible in the provided material. Without those details it is hard to judge how tight the links are, or whether the measures align with interventions of practical interest in vision or language settings. The framework may also prove too abstract for immediate application until concrete event definitions are worked out.

This is for readers already comfortable with measure-theoretic causality who want a finer language for complex data domains. A serious referee should see it because the claims are specific and checkable, even if revisions will likely be needed on the proof details and operationalization.

Referee Report

0 major / 2 minor

Summary. The paper introduces binary definitions for the presence or absence of causal effects at the level of events (rather than variables) inside the recently proposed measure-theoretic causal-spaces framework. It proves properties that link these definitions to (in)dependence under an intervention measure, develops quantitative measures of effect strength and nature, and shows that standard treatment-effect functionals (ATE, ATT, etc.) arise as special cases when events are chosen appropriately.

Significance. If the derivations hold, the work supplies a coherent event-level extension of causality that is continuous with classical variable-level measures. The explicit recovery of ATE/ATT as special cases and the measure-theoretic grounding are strengths that could support finer-grained causal questions in domains where variables lack semantic structure (images, text).

minor comments (2)

The abstract states that properties are proved and that standard measures are recovered, but the main text should include explicit statements of the intervention-measure construction (e.g., the precise sigma-algebra and measure used for each binary definition) so that readers can verify the independence claims without external reference.
Notation for events and intervention measures should be introduced once in a dedicated preliminary section and then used consistently; several ad-hoc symbols appear in the quantifying-measure definitions that are not cross-referenced to the binary definitions.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our manuscript and the recommendation for minor revision. We are pleased that the event-level definitions, their links to intervention-based independence, the quantitative measures, and the recovery of ATE/ATT as special cases are viewed as coherent extensions within the causal-spaces framework. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected


The paper takes the recently introduced causal-spaces axiomatization as an external starting point and introduces fresh binary definitions for the presence of event-level causal effects, proves their links to (in)dependence under intervention measures, defines quantitative strength measures, and shows that classical treatment-effect functionals are recovered as special cases. None of these steps reduces by construction to a fitted parameter, a self-citation chain, or a renaming of prior results; the derivation chain is therefore self-contained against the supplied axiomatic input.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the recently introduced causal spaces axiomatization as the ambient framework; no free parameters, invented entities, or additional ad-hoc axioms are mentioned in the abstract.

axioms (1)

domain assumption: Causal spaces provide a measure-theoretic axiomatization of causality sufficient for defining events and intervention measures.
The paper explicitly works inside this recently introduced framework and uses its intervention measures to link causal effects to independence.

pith-pipeline@v0.9.0 · 5493 in / 1284 out tokens · 43820 ms · 2026-05-16T22:51:58.237137+00:00 · methodology


