Efros, Eli Shechtman, and Oliver Wang
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
citing papers explorer
- ORBIS: Output-Guided Token Reduction with Distribution-Aware Matching for Video Diffusion Acceleration
  ORBIS uses output-guided token reduction and DATM to achieve 2x higher token reduction than AsymRnR, with up to 4.5x speedup and 79.3% energy savings versus an A100 GPU for video DiT models.
- PaintCopilot: Modeling Painting as Autonomous Artistic Continuation
  PaintCopilot models painting as an open-ended autoregressive process that predicts coherent brushstrokes from partial canvas observations using a ViT target predictor, a flow-matching stroke generator, and a VAE region sampler.
- PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios
  PostureObjectStitch generates assembly-aware anomaly images by decoupling multi-view features into high-frequency, texture, and RGB components, modulating them temporally in a diffusion model, and applying a conditional loss plus geometric priors to preserve correct component relationships.
- Any3DAvatar: Fast and High-Quality Full-Head 3D Avatar Reconstruction from Single Portrait Image
  Any3DAvatar reconstructs full-head 3D Gaussian avatars from one image via one-step denoising on a Plücker-aware scaffold plus auxiliary view supervision, beating prior single-image methods on fidelity while running substantially faster.