Masked autoencoders are scalable vision learners, 2021
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
citing papers explorer
- Demo-JEPA: Joint-Embedding Predictive Architecture for One-shot Cross-Embodiment Imitation
  Demo-JEPA enables one-shot cross-embodiment imitation by mapping visual demonstrations to shared latent future trajectories that serve as subgoals for the target agent's own forward dynamics planning.
- EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution
  EAM is a DiT-based blind super-resolution model that uses a triple-flow Ψ-DiT block, progressive masked image modeling, and in-context subject-aware prompting to reach state-of-the-art quantitative and visual results on standard datasets.