Masked Logit Nudging aligns visual autoregressive model logits with source token maps under target prompts inside cross-attention masks, delivering top image editing results on PIE benchmarks and strong reconstructions on COCO and OpenImages while running faster than diffusion approaches.
Image and video tokenization with binary spherical quantization
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models
Masked Logit Nudging aligns visual autoregressive model logits with source token maps under target prompts inside cross-attention masks, delivering top image editing results on PIE benchmarks and strong reconstructions on COCO and OpenImages while running faster than diffusion approaches.