Predictive Objectives Discard Exogenous Control-Relevant Features: A Controlled Mechanistic Study

Ayan Pendharkar

arxiv: 2606.30068 · v1 · pith:X4AU3M4Jnew · submitted 2026-06-29 · 💻 cs.LG

Pith reviewed 2026-06-30 07:02 UTC · model grok-4.3

keywords predictive objectives · JEPA · representation learning · exogenous features · control relevance · reward-free learning · reinforcement learning

The pith

Reward-free predictive objectives discard exogenous control-relevant features even when they are easy to encode.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that joint-embedding predictive objectives optimize for temporal predictability and therefore drop features an agent cannot control even when those features remain useful for choosing actions. A controlled 2x2 design varies controllability and relevance separately with a knob that isolates predictability from control utility. Across six objectives, every reward-free variant leaves the exogenous control-relevant feature at near-chance accuracy while a reward-grounded version keeps it. Recovery requires only 2 percent reward-labeled transitions and holds across two environments and latent sizes from 16 to 1024. The learned latents also achieve far less class separation than a supervised reference.

Core claim

Joint-embedding predictive objectives learn representations by predicting future latents, and in doing so they discard features that are exogenous yet control-relevant. In a 2x2 design that varies controllability and relevance independently, all evaluated reward-free predictive objectives leave the exogenous control-relevant feature near chance accuracy, while a reward-grounded variant retains it selectively. As little as 2 percent reward-labeled transitions is sufficient to recover the feature across environments and latent dimensions.

What carries the argument

A 2x2 experimental design that independently varies a feature's controllability and control-relevance using a predictability knob that decouples temporal predictability from control relevance.
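
As a minimal sketch of what such a knob could look like (an illustration under stated assumptions, not the paper's implementation): a binary feature keeps its previous value with probability `p_repeat` and flips otherwise. The name `p_repeat` echoes the repeat-probability setting in the Figure 5 caption; the two-state flip rule and the function names are assumptions. No action enters the update, so the feature is exogenous, and changing `p_repeat` changes how predictable the next value is without saying anything about whether reward depends on it.

```python
import random


def sample_exogenous_feature(n_steps, p_repeat, seed=0):
    """Binary feature c_t that keeps its previous value with probability p_repeat.

    Assumed rule for illustration: otherwise the value flips. No action appears
    in the update, so an agent cannot influence c_t.
    """
    rng = random.Random(seed)
    c = [rng.randint(0, 1)]
    for _ in range(n_steps - 1):
        if rng.random() < p_repeat:
            c.append(c[-1])
        else:
            c.append(1 - c[-1])
    return c


def repeat_rate(c):
    """Fraction of steps where c_{t+1} == c_t (empirical estimate of p_repeat)."""
    same = sum(1 for a, b in zip(c, c[1:]) if a == b)
    return same / (len(c) - 1)


def best_next_step_accuracy(p_repeat):
    """Accuracy of the best predictor of c_{t+1} given only c_t."""
    return max(p_repeat, 1.0 - p_repeat)
```

Under this assumed rule, `p_repeat = 0.5` makes the next value a fair coin flip given the current one, while values near 0 or 1 make it almost fully predictable.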

If this is right

Reward-free predictive objectives achieve near-chance accuracy on exogenous control-relevant features.
Reward-grounded variants retain the same features selectively.
Two percent reward-labeled transitions suffice to recover the dropped feature.
The effect appears in two environments with different surface forms and across latent dimensions 16 to 1024.
The JEPA latent realizes only a small fraction of the class separation attained by a supervised reference.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Purely predictive self-supervised methods may systematically under-represent uncontrollable but decision-relevant variables in control tasks.
Minimal reward supervision could serve as a general, low-cost correction for representation quality in reinforcement learning pipelines.
Hybrid objectives that add sparse reward signals without full supervision merit direct comparison against existing predictive baselines.

Load-bearing premise

The predictability knob truly separates temporal predictability from control-relevance without other confounds.
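
One way to probe this premise, sketched under assumptions: generate the same exogenous feature under a control-relevant and a control-irrelevant reward rule, then compare held-out next-step predictability across the two while confirming that reward dependence differs. Everything named here (`simulate`, `heldout_predictability`, `feature_reward_gap`, the reward rules, and the tabular predictor standing in for a learned one) is hypothetical and not taken from the paper.

```python
import random


def simulate(n, p_repeat, relevant, seed):
    """Roll out a random policy; return feature, action, and reward sequences."""
    rng = random.Random(seed)
    c = rng.randint(0, 1)
    feats, actions, rewards = [], [], []
    for _ in range(n):
        a = rng.randint(0, 1)
        # Hypothetical reward rules: matching the feature pays off (relevant),
        # or reward depends on the action alone (irrelevant).
        r = int(a == c) if relevant else a
        feats.append(c)
        actions.append(a)
        rewards.append(r)
        # Exogenous update: the action never enters here.
        c = c if rng.random() < p_repeat else 1 - c
    return feats, actions, rewards


def heldout_predictability(feats):
    """Fit a next-value lookup on the first half, score it on the second half."""
    half = len(feats) // 2
    train, test = feats[:half], feats[half:]
    counts = {0: [0, 0], 1: [0, 0]}
    for cur, nxt in zip(train, train[1:]):
        counts[cur][nxt] += 1
    rule = {cur: (0 if counts[cur][0] >= counts[cur][1] else 1) for cur in counts}
    hits = sum(1 for cur, nxt in zip(test, test[1:]) if rule[cur] == nxt)
    return hits / (len(test) - 1)


def feature_reward_gap(feats, actions, rewards, action=1):
    """Mean reward for `action` when c = 1 minus when c = 0 (0 means irrelevant)."""
    by_c = {0: [], 1: []}
    for c, a, r in zip(feats, actions, rewards):
        if a == action:
            by_c[c].append(r)
    return sum(by_c[1]) / len(by_c[1]) - sum(by_c[0]) / len(by_c[0])
```

If the knob decouples the two properties as intended, `heldout_predictability` should agree across the two conditions at a fixed `p_repeat`, while `feature_reward_gap` should be nonzero only in the relevant one.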

What would settle it

In the same 2x2 environments, measure accuracy on the exogenous control-relevant feature after training with any reward-free objective; accuracy substantially above chance would falsify the claim that these objectives discard the feature.
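
The measurement described here can be sketched as a linear probe: freeze the encoder, fit a logistic-regression classifier from latents to the binary feature, and report held-out accuracy against chance (0.50) and the 0.75 retain threshold used in the figure captions. The code below is a self-contained illustration in plain Python on synthetic latents; `make_latents`, the noise scale, and the training settings are assumptions, and the paper's own probe setup is not shown on this page.

```python
import math
import random


def make_latents(n, dim, encodes_feature, rng):
    """Synthetic stand-in for encoder outputs; labels are the binary feature c."""
    latents, labels = [], []
    for _ in range(n):
        c = rng.randint(0, 1)
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if encodes_feature:
            # One coordinate carries c (as -1 or +1) plus a little noise.
            z[0] = (2 * c - 1) + 0.3 * rng.gauss(0.0, 1.0)
        latents.append(z)
        labels.append(c)
    return latents, labels


def train_linear_probe(latents, labels, epochs=200, lr=0.5):
    """Logistic regression (weights and bias) via full-batch gradient descent."""
    dim, n = len(latents[0]), len(latents)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for z, y in zip(latents, labels):
            logit = sum(wi * zi for wi, zi in zip(w, z)) + b
            logit = max(-30.0, min(30.0, logit))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-logit)) - y
            for i in range(dim):
                grad_w[i] += err * z[i]
            grad_b += err
        w = [wi - lr * gi / n for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, latents, labels):
    """Fraction of held-out points whose sign of the logit matches the label."""
    correct = 0
    for z, y in zip(latents, labels):
        pred = 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else 0
        correct += pred == y
    return correct / len(labels)
```

A latent that encodes the feature should land well above 0.75 on held-out data, while a latent that carries no information about it should stay near 0.50; the falsification test amounts to checking which side of the threshold a reward-free encoder falls on.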

Figures

Figures reproduced from arXiv: 2606.30068 by Ayan Pendharkar.

**Figure 1.** Objective × cell retention matrix (linear-probe accuracy on QuadrantEnv; green ≥ 0.75 retained, red < 0.75 not retained). The exogenous control-relevant column (cell 4) is the paper’s central result: only the reward-grounded variant and the references retain the feature. jepa_reward retains exactly the relevant cells (1 and 4) and not the irrelevant ones (2 and 3), whereas jepa_ctrl partially recovers the …

**Figure 2.** Cell-4 results across both environments. Left: linear-probe accuracy (chance = 0.50, retain threshold = 0.75) for each objective on the exogenous control-relevant feature. Right: InfoNCE mutual information (nats; log 2 ≈ 0.693 is the maximum for a one-bit feature). All evaluated reward-free predictive objectives, including action-conditioned JEPA, controllability-based JEPA, and inverse dynamics with rando…
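
The log 2 ceiling quoted in this caption follows from the entropy of a balanced binary variable, since mutual information with a feature cannot exceed that feature's entropy. A small check in Python (the function name is illustrative):

```python
import math


def bernoulli_entropy_nats(p):
    """Entropy in nats of a binary variable with P(c = 1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


# A balanced one-bit feature has entropy log 2, about 0.693 nats.
max_mi_one_bit = bernoulli_entropy_nats(0.5)
```

An unbalanced feature has lower entropy, so the ceiling applies as stated only when the two values are equally likely.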

**Figure 3.** Cell-4 retention versus latent dimension for jepa and jepa_reward on QuadrantEnv and SwitchColor. JEPA stays near chance across every latent dimension from 16 to 1024, never approaching the retain threshold. jepa_reward stays near perfect from 16 to 512, with a slight dip at 1024 attributable to training budget at that scale. The failure is objective-structural, not architectural. Increasing latent capacit…

**Figure 4.** Cell-4 retention versus the fraction of reward-labeled transitions (log scale, QuadrantEnv). The retain threshold (0.75) is first crossed at a reward-label fraction of 0.02, two percent of transitions (mean accuracy 0.77). Even 1% of labeled transitions yields linear-probe accuracy 0.72, already well above the chance level of 0.50. The reward signal is highly efficient for recovering the exogenous control-…
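
A sketch of how a reward-label fraction could enter training, assuming the reward term is simply averaged over the labeled subset and added to the predictive term. How the paper actually combines the two losses is not stated on this page, so `reward_label_mask`, `combined_loss`, and `reward_weight` are hypothetical.

```python
import random


def reward_label_mask(n_transitions, label_fraction, seed=0):
    """Mark a random subset of transitions as carrying a reward label."""
    rng = random.Random(seed)
    n_labeled = round(n_transitions * label_fraction)
    labeled = set(rng.sample(range(n_transitions), n_labeled))
    return [i in labeled for i in range(n_transitions)]


def combined_loss(pred_losses, reward_losses, mask, reward_weight=1.0):
    """Predictive loss on all transitions plus reward loss on labeled ones only."""
    pred_term = sum(pred_losses) / len(pred_losses)
    labeled = [r for r, m in zip(reward_losses, mask) if m]
    reward_term = sum(labeled) / len(labeled) if labeled else 0.0
    return pred_term + reward_weight * reward_term
```

With 1,000 transitions and `label_fraction=0.02`, the mask selects 20 of them; every other transition contributes to the predictive term alone.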

**Figure 5.** Class separation (whitened centroid distance between c = 0 and c = 1 latent representations) for jepa, recon, and the supervised reference at convergence (cell 4, p_repeat = 0.5, full training). The horizontal reference marks the analytical bisimulation distance (1.0). JEPA realizes substantially less class separation than the references (0.105 versus ≈ 1.998); the separation gap is 1.893. The cell-4 featur…
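
To make the class-separation metric concrete, here is a one-dimensional sketch: standardize a latent coordinate by its overall standard deviation, then take the distance between the two class means. The exact whitening used in the paper is not given on this page, so the choice of total (rather than within-class) variance and the function name are assumptions.

```python
import math


def whitened_centroid_distance_1d(values, labels):
    """Distance between class means after dividing by the overall std.

    Assumes binary labels (0 or 1), both classes present, and nonzero variance.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    return abs(sum(class1) / len(class1) - sum(class0) / len(class0)) / std
```

Under this assumed definition, a coordinate that takes one value for c = 0 and another for c = 1 with balanced classes gives exactly 2, and a coordinate unrelated to c gives a value near 0.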

Original abstract

Joint-embedding predictive (JEPA-style) objectives learn representations by predicting future latents. In doing so they can discard features that are exogenous (uncontrollable by the agent) yet control-relevant, even when those features are trivially encodable. This occurs because the objective optimizes temporal predictability rather than control-relevance. We isolate this failure mode in a controlled 2x2 experimental design that varies feature controllability and relevance independently, using a predictability knob that decouples a feature's temporal predictability from its control-relevance. Comparing six objectives: reconstruction, JEPA, action-conditioned JEPA, controllability-based JEPA, inverse dynamics under a random policy, and reward-grounded JEPA, we observe that all evaluated reward-free predictive objectives leave the exogenous control-relevant feature near chance accuracy, while a reward-grounded variant retains it selectively. The remedy is label-efficient and robust: as little as 2% of reward-labeled transitions recovers the feature, the effect holds across two environments with different surface forms, and it persists across latent dimensions from 16 to 1024. Comparing the learned latent geometry against bisimulation theory's prediction, the JEPA latent realizes only a small fraction of the class separation a supervised reference attains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper shows reward-free predictive objectives drop exogenous control-relevant features in a 2x2 setup, but the predictability knob's decoupling is unverified and load-bearing.

The core observation is that all the reward-free objectives tested leave the exogenous control-relevant feature near chance, while the reward-grounded one keeps it, and 2% reward labels recover it. That pattern appears across two environments and a wide range of latent sizes.

What is new is the attempt to run a controlled 2x2 by varying controllability and relevance separately via the predictability knob, plus the side-by-side of six objectives and the bisimulation geometry check. The paper does a reasonable job documenting that the effect is stable under those surface changes and that the remedy is label-efficient.

The soft spot is exactly the one flagged in the stress test: the knob is supposed to vary temporal predictability independently of control-relevance so the failure can be blamed on the objective rather than on the knob itself. If the knob introduces correlation between those two properties, the near-chance result becomes an artifact. The abstract and the stress-test note give no concrete verification that the separation holds, and without that the 2x2 design does not isolate the claimed mechanism. The bisimulation comparison does not fix it. Statistical details on the accuracy numbers and the 2% claim are also missing from what is visible, so robustness is hard to judge.

This is for people building or using JEPA-style pretraining for control and robotics. It flags a practical limitation worth checking. The work is coherent on its own terms and engages the literature directly, so it deserves a serious referee even with the current gaps in the knob verification.

Referee Report

2 major / 2 minor

Summary. The paper claims that joint-embedding predictive (JEPA-style) objectives discard exogenous yet control-relevant features because they optimize temporal predictability rather than control-relevance. It isolates this via a controlled 2x2 design that varies controllability and relevance independently using a predictability knob, compares six objectives (reconstruction, JEPA, action-conditioned JEPA, controllability-based JEPA, inverse dynamics, reward-grounded JEPA), and reports that all reward-free variants leave the feature near chance while the reward-grounded one retains it. The remedy requires only 2% reward-labeled transitions, holds across two environments and latent dimensions 16-1024, and the learned geometry realizes only a small fraction of the class separation predicted by bisimulation theory.

Significance. If the decoupling holds and results replicate, the work identifies a mechanistic limitation in reward-free predictive objectives for control tasks and offers a label-efficient fix. The controlled 2x2 design, cross-environment robustness, and explicit geometry comparison to bisimulation theory are strengths that could inform representation learning in RL. The empirical focus on feature retention under different objectives provides falsifiable predictions about when predictive losses suffice versus when reward grounding is required.

major comments (2)

[Abstract and §3] The predictability knob is asserted to 'decouple a feature's temporal predictability from its control-relevance' so that the 2x2 design isolates the claimed failure mode. No direct validation is referenced (e.g., no predictability metric or ablation confirming the exogenous feature remains equally predictable when control-relevant), which is load-bearing: if the knob correlates the two properties, the near-chance accuracy under reward-free objectives could be an artifact rather than evidence for the mechanism. This also affects the 2%-label remedy and six-objective comparison.
[Results, implied by abstract claims] The central quantitative claims ('near chance accuracy', 'as little as 2% of reward-labeled transitions recovers the feature', 'persists across latent dimensions from 16 to 1024') lack accompanying statistical details, error bars, or full tables in the provided abstract; without these, the effect size and robustness cannot be assessed, undermining the cross-environment and cross-dimension generalization statements.

minor comments (2)

[Abstract] The abstract is dense, packing in six objectives and multiple claims; a short table summarizing the 2x2 conditions and objective variants would improve readability.
[Abstract] The bisimulation geometry comparison is mentioned but not quantified in the abstract; specifying the exact metric (e.g., class separation ratio) would clarify how 'small fraction' is measured.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our work. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

Referee: [Abstract and §3] The predictability knob is asserted to 'decouple a feature's temporal predictability from its control-relevance' so that the 2x2 design isolates the claimed failure mode. No direct validation is referenced (e.g., no predictability metric or ablation confirming the exogenous feature remains equally predictable when control-relevant), which is load-bearing: if the knob correlates the two properties, the near-chance accuracy under reward-free objectives could be an artifact rather than evidence for the mechanism. This also affects the 2%-label remedy and six-objective comparison.

Authors: We agree that explicit validation of the decoupling is necessary to support the 2x2 design. The current manuscript does not include a direct predictability metric or ablation confirming that the exogenous feature remains equally predictable across control-relevant and control-irrelevant conditions. We will add this validation in the revised §3, reporting temporal predictability (via a held-out linear predictor) for the feature under both settings to confirm the knob achieves the intended decoupling.
revision: yes

Referee: [Results, implied by abstract claims] The central quantitative claims ('near chance accuracy', 'as little as 2% of reward-labeled transitions recovers the feature', 'persists across latent dimensions from 16 to 1024') lack accompanying statistical details, error bars, or full tables in the provided abstract; without these, the effect size and robustness cannot be assessed, undermining the cross-environment and cross-dimension generalization statements.

Authors: The full manuscript reports these results with error bars (standard deviation across 5 random seeds), effect sizes, and complete tables in the results section and appendix. The abstract provides a high-level summary consistent with standard practice. To improve accessibility, we will revise the abstract to briefly reference the statistical details (e.g., noting low variance across seeds and environments) while retaining conciseness, or add explicit pointers to the supporting tables.
revision: partial

Circularity Check

0 steps flagged

Empirical comparative study with no self-referential derivation or fitted predictions

The paper is a controlled experimental study comparing six objectives across environments, latent dimensions, and label fractions. No equations, derivations, or self-citations are presented that reduce any claimed result to its inputs by construction. The predictability knob is an experimental manipulation whose validity is an empirical question, not a definitional loop. All central claims rest on measured accuracies rather than renaming, fitting-then-predicting, or load-bearing self-citations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of the controlled experimental design and the assumption that the predictability knob achieves the stated decoupling.

axioms (1)

domain assumption: The experimental environments and predictability knob allow independent variation of feature controllability and control-relevance.
Invoked to justify the 2x2 design and the isolation of the failure mode.

pith-pipeline@v0.9.1-grok · 5744 in / 1095 out tokens · 36206 ms · 2026-06-30T07:02:13.300209+00:00 · methodology
