DLLM-JEPA pairs JEPA with masked diffusion LMs to enable single-pass self-supervised fine-tuning that improves task accuracy, lowers held-out loss, and preserves base-model performance.
Seeds are matched between baseline and DLLM-JEPA within each task/configuration cell so that any difference is attributable to the objective.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models