Advances in Neural Information Processing Systems

LAION-5B: An open large-scale dataset for training next generation image-text models

12 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

RevealLayer decomposes natural images into multiple RGBA layers using diffusion models with region-aware attention, occlusion-guided adaptation, and a composite loss, outperforming prior methods on a new benchmark dataset.

Progressive Photorealistic Simplification

cs.CV · 2026-05-11 · unverdicted · novelty 7.0

Progressive semantic image simplification uses VLMs and a verifier to iteratively remove and inpaint scene elements while preserving photorealism, distilled into an image-to-video model for direct sequence prediction.

VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation

cs.CV · 2026-05-02 · unverdicted · novelty 7.0

VAnim creates open-domain text-to-SVG animations via sparse state updates on a persistent DOM tree, identification-first planning, and rendering-aware RL with a new 134k-example benchmark.

Watch Your Step: Information Injection in Diffusion Models via Shadow Timestep Embedding

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

Timestep embeddings in diffusion models function as a separable side channel that can carry dedicated information for adversarial injection or detection.

Long-Text-to-Image Generation via Compositional Prompt Decomposition

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

PRISM lets pre-trained text-to-image models handle long prompts by breaking them into compositional parts, predicting noise separately, and merging outputs via energy-based conjunction, matching fine-tuned models while generalizing better to prompts over 500 tokens.

SafeDiffusion-R1: Online Reward Steering for Safe Diffusion Post-Training

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

SafeDiffusion-R1 uses online GRPO with CLIP embedding steering to cut inappropriate content from 48.9% to 18.07% and nudity detections from 646 to 15 in diffusion models while raising GenEval scores from 42.08% to 47.83% and generalizing across seven harm categories without supervised pairs or extra

AdaEraser: Training-Free Object Removal via Adaptive Attention Suppression

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

AdaEraser introduces token-wise adaptive attention suppression in diffusion denoising to enable high-quality training-free object removal by modulating suppression according to evolving self-attention maps.

Threshold-Guided Optimization for Visual Generative Models

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

A threshold-guided alignment method lets visual generative models be optimized directly from scalar human ratings instead of requiring paired preference data.

Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes

cs.CV · 2026-05-22 · unverdicted · novelty 5.0

Introduces dual pose-image representation, cross-modal alignment, and iterative construction to improve prompt alignment and diversity in multi-person text-to-image generation.

Mutual Enhancement Between Global Tokens and Patch Tokens: From Theory to Practice

cs.CV · 2026-05-11 · unverdicted · novelty 5.0

TaTok is a theoretically grounded adaptive tokenization method that uses global tokens and cumulative conditional entropy filtering to reduce redundancy while improving reconstruction quality over fixed-rate patch tokenization.

Venus-DeFakerOne: Unified Fake Image Detection & Localization

cs.CV · 2026-05-13

Possibilistic Predictive Uncertainty for Deep Learning

cs.LG · 2026-05-01

citing papers explorer

Showing 12 of 12 citing papers.

RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition
cs.CV · 2026-05-12 · unverdicted · none · ref 73

Progressive Photorealistic Simplification
cs.CV · 2026-05-11 · unverdicted · none · ref 19

VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation
cs.CV · 2026-05-02 · unverdicted · none · ref 28

Watch Your Step: Information Injection in Diffusion Models via Shadow Timestep Embedding
cs.LG · 2026-05-01 · unverdicted · none · ref 25

Long-Text-to-Image Generation via Compositional Prompt Decomposition
cs.CV · 2026-04-20 · unverdicted · none · ref 51

SafeDiffusion-R1: Online Reward Steering for Safe Diffusion Post-Training
cs.CV · 2026-05-18 · unverdicted · none · ref 4

AdaEraser: Training-Free Object Removal via Adaptive Attention Suppression
cs.CV · 2026-05-15 · unverdicted · none · ref 50

Threshold-Guided Optimization for Visual Generative Models
cs.LG · 2026-05-06 · unverdicted · none · ref 5

Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes
cs.CV · 2026-05-22 · unverdicted · none · ref 74

Mutual Enhancement Between Global Tokens and Patch Tokens: From Theory to Practice
cs.CV · 2026-05-11 · unverdicted · none · ref 103

Venus-DeFakerOne: Unified Fake Image Detection & Localization
cs.CV · 2026-05-13 · unreviewed · ref 236

Possibilistic Predictive Uncertainty for Deep Learning
cs.LG · 2026-05-01 · unreviewed · ref 35
