Masked-position MLM plus JEPA latent prediction outperforms MLM-only pretraining on 10-11 of 16 downstream tasks for 35M-150M protein models while JEPA alone fails.
Llm-jepa: Large language models meet joint embedding predictive architectures
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5roles
background 2polarities
background 2representative citing papers
Applying STP at consecutive semantic reasoning steps achieves 168x more accurate multi-step latent prediction on ProcessBench than frozen baselines, with trajectories forming smooth curves best captured by non-linear predictors.
A five-phase co-training framework enables stable JEPA pretraining on EHR trajectories, producing converging latent rollouts and higher multi-task AUROC than baselines on MIMIC-IV ICU data.
Imagining in 360° decouples visual search into a single-step probabilistic semantic layout predictor and an actor, removing the need for multi-turn CoT reasoning and trajectory annotations while improving efficiency in 360° environments.
An empirical audit of 22 JEPA-style training auxiliaries on Llama-3.2-1B fine-tuning for regex generation finds no statistically significant task improvement after multiple-testing correction, even when auxiliaries visibly alter hidden-state geometry.
citing papers explorer
-
ProteinJEPA: Latent prediction complements protein language models
Masked-position MLM plus JEPA latent prediction outperforms MLM-only pretraining on 10-11 of 16 downstream tasks for 35M-150M protein models while JEPA alone fails.
-
Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling
Applying STP at consecutive semantic reasoning steps achieves 168x more accurate multi-step latent prediction on ProcessBench than frozen baselines, with trajectories forming smooth curves best captured by non-linear predictors.
-
Clin-JEPA: A Multi-Phase Co-Training Framework for Joint-Embedding Predictive Pretraining on EHR Patient Trajectories
A five-phase co-training framework enables stable JEPA pretraining on EHR trajectories, producing converging latent rollouts and higher multi-task AUROC than baselines on MIMIC-IV ICU data.
-
Beyond Thinking: Imagining in 360$^\circ$ for Humanoid Visual Search
Imagining in 360° decouples visual search into a single-step probabilistic semantic layout predictor and an actor, removing the need for multi-turn CoT reasoning and trajectory annotations while improving efficiency in 360° environments.
-
Representation Without Reward: A JEPA Audit for LLM Fine-Tuning
An empirical audit of 22 JEPA-style training auxiliaries on Llama-3.2-1B fine-tuning for regex generation finds no statistically significant task improvement after multiple-testing correction, even when auxiliaries visibly alter hidden-state geometry.