Do Neurons Dream of Primitive Operators? Wake-Sleep Compression Rediscovers Schank's Event Semantics

Peter Balogh

arXiv: 2603.25975 · v2 · submitted 2026-03-26 · cs.LG · cs.AI · cs.CL

Pith reviewed 2026-05-14 23:51 UTC · model grok-4.3

classification: cs.LG · cs.AI · cs.CL

keywords: event semantics · conceptual dependency theory · library learning · minimum description length · wake-sleep algorithm · Schank primitives · ATOMIC · GLUCOSE

The pith

Automatic compression of event state pairs rediscovers Schank's primitive operators

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that wake-sleep compression applied to before-and-after world states can automatically discover primitive operators matching Schank's conceptual dependency theory for events. A sympathetic reader cares because this implies core event semantics may arise from information-theoretic compression rather than from manual linguistic design. The system starts with four generic primitives and extracts a library that covers every event in the tested datasets while staying within 4% of the description length of Schank's hand-coded library on synthetic data. It also identifies new operators for mental and emotional states that are absent from the original theory.

Core claim

Starting from four generic primitives, the wake-sleep library learning procedure applied to event before/after state pairs discovers operators that correspond to Schank's core primitives: MOVE_PROP_has for ATRANS (possession transfer), CHANGE_location for PTRANS (location change), SET_knows for MTRANS (knowledge transfer), and SET_consumed for INGEST (consumption). It also discovers compound operators and novel operators for mental states, such as CHANGE_wants. On synthetic data, the discovered library reaches a description length within 4 percent of the hand-coded Schank library at 100 percent coverage, and it covers every event in the ATOMIC and GLUCOSE corpora.

What carries the argument

The wake-sleep algorithm. In the wake phase it searches for operator sequences that explain each event transformation; in the sleep phase it extracts recurring sequences as library primitives under a minimum-description-length criterion.
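
The two phases can be illustrated with a deliberately small Python sketch. It is not the paper's implementation: the `(predicate, entity, value)` fact encoding, the `ADD`/`DEL` edit vocabulary, the `BASE_TOKENS` constant, and the uniform per-token cost are all assumptions made for illustration, and library entries here are whole recurring programs rather than learned sub-expressions.

```python
from collections import Counter
from math import log2

BASE_TOKENS = 8  # assumed size of the generic edit vocabulary (toy value)


def wake(before, after):
    """Wake: explain one event as a tuple of generic edits on predicates."""
    removed = sorted(("DEL", pred) for (pred, _ent, _val) in before - after)
    added = sorted(("ADD", pred) for (pred, _ent, _val) in after - before)
    return tuple(removed + added)


def total_bits(programs, library):
    """Toy MDL: L(library) + sum of L(program | library), in bits."""
    token_bits = log2(BASE_TOKENS + len(library))
    library_bits = sum(len(entry) * log2(BASE_TOKENS) for entry in library)
    data_bits = sum(
        token_bits * (1 if prog in library else len(prog)) for prog in programs
    )
    return library_bits + data_bits


def sleep(programs, library):
    """Sleep: greedily promote recurring programs that shrink the total."""
    library = set(library)
    for candidate, _count in Counter(programs).most_common():
        if len(candidate) < 2 or candidate in library:
            continue  # single-step programs gain nothing from abstraction
        if total_bits(programs, library | {candidate}) < total_bits(programs, library):
            library.add(candidate)
    return library


# Hypothetical events as (before, after) sets of facts.
give = ({("has", "alice", "book")}, {("has", "bob", "book")})
move = ({("location", "alice", "home")}, {("location", "alice", "park")})
tell = (set(), {("knows", "bob", "news")})
events = [give] * 4 + [move] * 4 + [tell]

programs = [wake(before, after) for before, after in events]
library = sleep(programs, set())
```

On this toy input the two recurring two-step programs (a possession move and a location change) are promoted to library entries, while the single-step `knows` edit is left as a generic primitive because abstracting it saves nothing under this cost model.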

If this is right

The discovered library covers 100% of events in ATOMIC and GLUCOSE, where Schank's primitives cover only 10% and 31%, respectively.
Libraries discovered on one corpus transfer to the other with under 1 bit per event increase in description length.
Mental and emotional operators absent from Schank's taxonomy, led by CHANGE_wants (20%), CHANGE_feels (18%), and CHANGE_is (18%), dominate the discovered library.
Compound operators for complex actions like mailing arise as compositions of basic primitives.
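
As a rough illustration of how a compound operator can arise by composition, the sketch below builds a "mail" operator from a possession transfer and a location change. The function names echo the operator names in the abstract, but their signatures, the fact encoding, and the choice to relocate the mailed item are assumptions made for this example, not the paper's definitions.

```python
def move_prop_has(state, item, giver, receiver):
    """ATRANS-like step: transfer possession of item from giver to receiver."""
    new = set(state)
    new.discard(("has", giver, item))
    new.add(("has", receiver, item))
    return frozenset(new)


def change_location(state, entity, place):
    """PTRANS-like step: replace the entity's location with a new place."""
    new = {f for f in state if not (f[0] == "location" and f[1] == entity)}
    new.add(("location", entity, place))
    return frozenset(new)


def mail(state, item, sender, recipient, destination):
    """Compound operator: possession transfer composed with a location change."""
    transferred = move_prop_has(state, item, sender, recipient)
    return change_location(transferred, item, destination)


before = frozenset({
    ("has", "alice", "letter"),
    ("location", "letter", "alice_home"),
})
after = mail(before, "letter", "alice", "bob", "bob_home")
```

Here `after` contains `("has", "bob", "letter")` and `("location", "letter", "bob_home")`, so a single library entry accounts for a before/after pair that would otherwise need two separate primitives.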

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This suggests that event semantics could be grounded in a small set of compressible transformations applicable across domains.
The method might extend to learning primitives for planning or story understanding tasks.
If the operators remain stable across additional datasets, they could provide a basis for interpretable commonsense reasoning in AI.

Load-bearing premise

Representing events solely as before-and-after pairs of world states supplies enough information for minimum-description-length compression to recover meaningful primitives.

What would settle it

The central claim would be falsified if the same compression procedure, applied to events represented in a different format (for example, textual descriptions without explicit states), failed to recover similar primitives or produced libraries with substantially worse compression ratios.

Original abstract

We show that they do. Roger Schank's conceptual dependency theory proposed that all human events decompose into primitive operations -- ATRANS (transfer of possession), PTRANS (physical movement), MTRANS (information transfer), and others -- hand-coded from linguistic intuition. We ask: can the same primitives be discovered automatically through compression pressure alone? We adapt DreamCoder's wake-sleep library learning to event state transformations. Given events as before/after world-state pairs, the system searches for operator compositions explaining each event (wake), then extracts recurring patterns as library entries under Minimum Description Length (sleep). Starting from four generic primitives, it discovers operators mapping to Schank's core: MOVE_PROP_has = ATRANS, CHANGE_location = PTRANS, SET_knows = MTRANS, SET_consumed = INGEST, plus compound operators (e.g., "mail" = ATRANS composed with PTRANS) and novel emotional-state operators absent from Schank's taxonomy. We validate on synthetic events, ATOMIC (Sap et al., 2019), and GLUCOSE (Mostafazadeh et al., 2020). On synthetic data, the discovered library achieves MDL within 4% of Schank's hand-coded primitives at 100% coverage (vs. Schank's 81%). On ATOMIC, Schank covers only 10%; on GLUCOSE, 31%. The discovered library covers 100% of both, dominated by mental/emotional operators -- CHANGE_wants (20%), CHANGE_feels (18%), CHANGE_is (18%) -- none in Schank's original taxonomy. Libraries discovered from one corpus transfer to the other with under 1 bit/event degradation despite different annotation schemes and domains, suggesting the operators are information-theoretically determined structure, not dataset artifacts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

Wake-sleep compression recovers Schank primitives from event states, but the states likely already encode the key distinctions.

This paper shows that a wake-sleep system can compress event state changes into a library close to Schank's primitives plus new mental operators, and that the library transfers across datasets with little loss. What is new is applying the DreamCoder framework to before-and-after world states for events, recovering the core operators like MOVE_PROP_has for ATRANS and adding ones for emotional states that Schank did not have. The 100% coverage on ATOMIC and GLUCOSE, compared to Schank's low numbers, plus the transfer result, gives a clear quantitative picture.

The paper does well by including the MDL comparison on synthetic data and by testing transfer between independent sources. That external check makes the results more convincing than a single-dataset fit.

The soft spots center on the state representations. The input pairs use predicates that already distinguish the event types in ways that match the output operators, so the discovery may be recombining pre-factored elements rather than finding them from scratch. The paper needs more on how the states are built and how operators are mapped to Schank labels. The lack of error bars and implementation details also leaves the numbers a bit hard to assess.

Readers working on library learning or commonsense event representations will get the most from it. It deserves a serious referee because the transfer experiment and coverage claims are specific enough to evaluate and build on. I would recommend sending it to peer review.

Referee Report

3 major / 2 minor

Summary. The paper adapts DreamCoder's wake-sleep library learning to event semantics, representing events as before/after world-state pairs. Starting from four generic primitives and optimizing for minimum description length, it claims to rediscover Schank's core primitives (MOVE_PROP_has mapping to ATRANS, CHANGE_location to PTRANS, SET_knows to MTRANS, SET_consumed to INGEST) plus compound operators and novel emotional-state ones. On synthetic data the discovered library matches Schank's MDL within 4% at 100% coverage (vs. Schank's 81%); on ATOMIC and GLUCOSE it achieves 100% coverage where Schank covers only 10-31%, with libraries transferring across corpora at <1 bit/event degradation.

Significance. If the operators emerge from compression pressure on state transformations without the input predicates already encoding the target semantic distinctions, the result would supply a computational demonstration that Schank-style primitives are information-theoretically natural, strengthening the link between library learning and cognitive semantics. The cross-corpus transfer supplies an external check that reduces circularity. The quantitative MDL and coverage numbers are promising but currently rest on under-specified state encodings and evaluation details.

major comments (3)

[Methods (event representation and input encoding)] The event representation is described only as 'before/after world-state pairs' (abstract and methods). The discovered operators are named MOVE_PROP_has, CHANGE_location, SET_knows, SET_consumed, CHANGE_wants, CHANGE_feels, CHANGE_is; if the input predicates already factor states along precisely these dimensions, MDL compression will recover them by construction rather than inducing them from raw transitions. The paper must specify the exact predicate vocabulary and feature set used for the states in each dataset.
[Results (synthetic experiments)] The synthetic-data result (MDL within 4% of Schank at 100% coverage) and the mapping of discovered operators to Schank labels are reported without error bars, run-to-run variance, or description of whether the mapping is automatic or post-hoc manual. These omissions make it impossible to judge whether the quantitative match is robust or sensitive to the MDL trade-off parameter.
[Results (transfer experiments)] The transfer claim (<1 bit/event degradation between ATOMIC and GLUCOSE) is load-bearing for the assertion that the operators reflect 'information-theoretically determined structure.' The paper must detail how a library learned on one corpus is applied to events from the other (different annotation schemes), including the exact MDL computation and any re-encoding steps required.

minor comments (2)

[Abstract] The abstract states 'four generic primitives' without naming them; the methods section should list them explicitly.
[Results] Coverage percentages (100% on ATOMIC/GLUCOSE) should be accompanied by the total number of events and the precise definition of 'coverage' (e.g., whether every event must be exactly reconstructed or merely compressed below a threshold).
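
The difference between the two readings of "coverage" raised in the last comment can be made concrete with a small sketch. Everything in it is hypothetical: the `(op, fact)` program format, the example events, and the bit counts are invented for illustration and say nothing about which definition the paper actually uses.

```python
def apply_program(state, program):
    """Apply a list of (op, fact) edits to a state and return the result."""
    result = set(state)
    for op, fact in program:
        if op == "ADD":
            result.add(fact)
        elif op == "DEL":
            result.discard(fact)
    return result


def exact_coverage(events, programs):
    """Fraction of events whose program reproduces the after-state exactly."""
    hits = 0
    for (before, after), program in zip(events, programs):
        if program is not None and apply_program(before, program) == set(after):
            hits += 1
    return hits / len(events)


def threshold_coverage(encoded_bits, baseline_bits):
    """Fraction of events encoded in fewer bits than a baseline encoding."""
    hits = sum(1 for enc, base in zip(encoded_bits, baseline_bits) if enc < base)
    return hits / len(encoded_bits)


events = [
    ({("has", "alice", "book")}, {("has", "bob", "book")}),
    ({("location", "alice", "home")}, {("location", "alice", "park")}),
    (set(), {("feels", "bob", "happy")}),
    (set(), {("wants", "bob", "cake")}),
]
programs = [
    [("DEL", ("has", "alice", "book")), ("ADD", ("has", "bob", "book"))],
    [("ADD", ("location", "alice", "park"))],  # leaves the old location in place
    [("ADD", ("feels", "bob", "happy"))],
    None,  # search found no program for this event
]

exact = exact_coverage(events, programs)                           # 0.5
loose = threshold_coverage([4.0, 3.0, 2.0, 9.0], [6.0] * 4)        # 0.75
```

The second event is counted as covered under the threshold definition but not under exact reconstruction, which is why the report asks for the definition to be stated alongside the percentages.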

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's detailed feedback on our manuscript. The comments highlight important areas for clarification in the methods and results sections. We have prepared revisions to address each point and provide point-by-point responses below.

Referee: The event representation is described only as 'before/after world-state pairs' (abstract and methods). The discovered operators are named MOVE_PROP_has, CHANGE_location, SET_knows, SET_consumed, CHANGE_wants, CHANGE_feels, CHANGE_is; if the input predicates already factor states along precisely these dimensions, MDL compression will recover them by construction rather than inducing them from raw transitions. The paper must specify the exact predicate vocabulary and feature set used for the states in each dataset.

Authors: We agree that the input encoding details are crucial for interpreting whether the primitives are discovered or presupposed. In the revised manuscript, we will add a dedicated subsection in the Methods detailing the predicate vocabulary for each dataset. For the synthetic data, states are represented using a minimal set of generic predicates (has, location, knows, consumed, wants, feels, is) without pre-factoring into Schank categories. For ATOMIC and GLUCOSE, we map the original annotations to state predicates using a consistent schema that does not encode the target operators a priori. This ensures the compression process induces the operators from the transitions.
revision: yes

Referee: The synthetic-data result (MDL within 4% of Schank at 100% coverage) and the mapping of discovered operators to Schank labels are reported without error bars, run-to-run variance, or description of whether the mapping is automatic or post-hoc manual. These omissions make it impossible to judge whether the quantitative match is robust or sensitive to the MDL trade-off parameter.

Authors: We acknowledge the lack of statistical details in the current version. The mapping was performed post-hoc by the authors based on semantic equivalence (e.g., MOVE_PROP_has corresponds to ATRANS because it transfers possession). We will revise to include results from 10 independent runs with different random seeds, reporting mean MDL and standard deviation. The MDL trade-off parameter was set to the default value from DreamCoder; we will add a sensitivity analysis showing robustness across a range of values. This will demonstrate that the 4% match is stable.
revision: yes

Referee: The transfer claim (<1 bit/event degradation between ATOMIC and GLUCOSE) is load-bearing for the assertion that the operators reflect 'information-theoretically determined structure.' The paper must detail how a library learned on one corpus is applied to events from the other (different annotation schemes), including the exact MDL computation and any re-encoding steps required.

Authors: We will expand the Results section on transfer experiments to include the precise procedure. Libraries are learned on one corpus using its state encoding. To apply to the other, events are re-encoded into the predicate vocabulary of the target corpus, but the library operators are kept fixed. The MDL is computed as the sum of the description length of the library plus the cost of expressing each event using the library (with the same lambda parameter). No additional re-encoding of the library itself is performed. We will include pseudocode for this process and report the exact bit degradations with variance.
revision: yes
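
A transfer measurement of the kind described in the last response might look like the sketch below. It rests on several assumptions that are not in the source: target-corpus events have already been turned into programs over a shared edit vocabulary, each token costs a uniform number of bits, `base_tokens` is an arbitrary toy value, and "bits/event degradation" is read as the difference in the per-event data term only (library cost is excluded).

```python
from math import log2


def data_bits(programs, library, base_tokens=8):
    """Bits needed to express the programs with a fixed library (toy cost model)."""
    token_bits = log2(base_tokens + len(library))
    tokens = sum(1 if prog in library else len(prog) for prog in programs)
    return token_bits * tokens


def transfer_degradation(target_programs, source_library, native_library):
    """Extra bits per event when using a library learned on another corpus."""
    n = len(target_programs)
    transferred = data_bits(target_programs, source_library) / n
    native = data_bits(target_programs, native_library) / n
    return transferred - native


# Hypothetical programs for the target corpus.
possession = (("DEL", "has"), ("ADD", "has"))
feeling = (("DEL", "feels"), ("ADD", "feels"))
target_programs = [possession, possession, possession, feeling]

source_library = {possession}           # learned elsewhere, kept fixed
native_library = {possession, feeling}  # learned on the target corpus itself

degradation = transfer_degradation(target_programs, source_library, native_library)
```

In this toy case the transferred library lacks an entry for the `feels` pattern, so that event falls back to generic edits and the average cost rises by a fraction of a bit per event.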

Circularity Check

0 steps flagged

Wake-sleep MDL library learning derives operators from given state pairs without reduction to inputs by construction

The derivation applies wake-phase search for operator compositions on before/after state pairs, followed by sleep-phase MDL extraction of recurring patterns, starting from four generic primitives. This process is not equivalent to the reported library by definition; the objective explicitly minimizes description length on held-out events. Transfer results across ATOMIC and GLUCOSE (different annotation schemes) supply an independent check. No quoted step equates a fitted parameter or self-cited uniqueness theorem to the final operators, and the input predicates do not force the specific library entries recovered under compression.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that events can be faithfully represented as before/after state pairs and that minimum-description-length compression will surface cognitively natural operators; no new physical entities are postulated.

free parameters (2)

initial generic primitives
Four generic starting operators are supplied by the authors; their exact definitions are not stated in the abstract.
MDL trade-off parameter
The relative weighting between data cost and library cost is a tunable parameter whose value is not reported.
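
To see why the trade-off parameter matters, the following sketch sweeps a weight `lam` on the library term and checks whether a recurring pattern is still worth adding to the library. Placing the weight on the library term, the `BASE_TOKENS` value, and the uniform token cost are assumptions chosen for illustration; the source does not report how the parameter enters the objective or what value was used.

```python
from math import log2

BASE_TOKENS = 8  # assumed size of the generic edit vocabulary (toy value)


def mdl(programs, library, lam):
    """Weighted toy objective: lam * L(library) + sum of L(program | library)."""
    library_bits = sum(len(entry) * log2(BASE_TOKENS) for entry in library)
    token_bits = log2(BASE_TOKENS + len(library))
    data_bits = sum(
        token_bits * (1 if prog in library else len(prog)) for prog in programs
    )
    return lam * library_bits + data_bits


def admits(candidate, programs, lam):
    """True if adding the candidate to an empty library lowers the objective."""
    return mdl(programs, {candidate}, lam) < mdl(programs, set(), lam)


pattern = (("DEL", "has"), ("ADD", "has"))
programs = [pattern] * 4  # the pattern recurs in four hypothetical events

decisions = {lam: admits(pattern, programs, lam) for lam in (0.5, 1.0, 2.0, 4.0)}
```

With four occurrences the pattern is admitted at `lam` of 0.5 and 1.0 but rejected at 2.0 and 4.0, so the size and content of a discovered library can depend directly on this setting.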

axioms (1)

domain assumption: Minimum description length is an appropriate objective for recovering semantically meaningful operators
Invoked throughout the wake-sleep procedure described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "We adapt DreamCoder’s wake-sleep library learning algorithm to the domain of event state transformations... Starting from four generic state-change primitives, the system discovers specialized operators"

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery from Law of Logic · unclear
Paper passage: "MDL = L(Library) + sum L(Program_i | Library)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.