
ControlNeXt: Powerful and efficient control for image and video generation

Peng, B · 2024 · arXiv 2408.06070

16 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

TexADiff integrates a Relative Texture Density Map into diffusion-based super-resolution to address imbalanced textures in remote sensing images, yielding better high-frequency details and downstream task gains.

Immune2V: Image Immunization Against Dual-Stream Image-to-Video Generation

cs.CV · 2026-04-12 · unverdicted · novelty 7.0

Immune2V immunizes images against dual-stream I2V generation by enforcing temporally balanced latent divergence and aligning generative features to a precomputed collapse trajectory, yielding stronger persistent degradation than image-level baselines.

GT-SVJ: Generative-Transformer-Based Self-Supervised Video Judge For Efficient Video Reward Modeling

cs.CV · 2026-02-05 · unverdicted · novelty 7.0

GT-SVJ turns video generative models into self-supervised reward judges via EBM reformulation and contrastive training on controlled synthetic degradations, claiming SOTA on GenAI-Bench and MonteBench with 30K annotations.

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer

cs.CV · 2025-11-28 · unverdicted · novelty 7.0

One-to-All Animation enables alignment-free character animation and image pose transfer via self-supervised outpainting reformulation, reference extraction, hybrid fusion attention, identity-robust pose control, and token replacement for long videos.

Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

cs.CV · 2025-09-27 · unverdicted · novelty 7.0

Vid-Freeze immunizes images by adding perturbations that target attention dynamics in I2V models to enforce temporal freezing and suppress motion synthesis.

PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models

cs.CV · 2025-05-28 · unverdicted · novelty 7.0

PacTure uses view packing and next-scale autoregressive prediction to generate consistent multi-view PBR textures faster than prior sequential or cross-attention methods.

InstanceControl: Controllable Complex Image Generation without Instance Labeling

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

InstanceControl uses VLMs to auto-generate instance masks from text and visual conditions, with adaptive refinement, to enable controllable multi-object image generation without manual labeling.

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

Introduces mesh tokenization to condition DiT-based video diffusion models directly on 3D human meshes for motion control without 2D rendering.

SignVerse-2M: A Two-Million-Clip Pose-Native Universe of 55+ Sign Languages

cs.CV · 2026-05-03 · unverdicted · novelty 6.0

SignVerse-2M provides a 2-million-clip multilingual pose-native dataset for sign language derived from public videos via DWPose preprocessing to enable robust modeling in real-world conditions.

HVG-3D: Bridging Real and Simulation Domains for 3D-Conditional Hand-Object Interaction Video Synthesis

cs.CV · 2026-03-31 · unverdicted · novelty 6.0

HVG-3D uses a 3D-aware diffusion architecture with ControlNet to synthesize high-fidelity hand-object interaction videos from 3D control signals, achieving state-of-the-art spatial fidelity and temporal coherence on the TASTE-Rob dataset.

VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification

cs.CV · 2025-12-10 · unverdicted · novelty 6.0

VHOI densifies sparse trajectories into color-encoded HOI mask sequences and conditions a fine-tuned video diffusion model on them to produce controllable human-object interaction videos, including full navigation sequences.

3D Scene-Adaptive Trajectory-Controllable Human Image Animation with Camera Movement

cs.CV · 2026-06-29 · unverdicted · novelty 5.0 · 2 refs

Presents a scene-adaptive 3D human image animation framework using ground-adaptive motion retargeting and viewpoint-adaptive latent fusion to control human and camera trajectories, claiming improvements on two benchmarks.

EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

EasyVFX decouples VFX generation via frequency-aware Mixture-of-Experts and test-time training to achieve realistic effects with limited resources.

DepthPilot: From Controllability to Interpretability in Colonoscopy Video Generation

cs.CV · 2026-04-29 · unverdicted · novelty 5.0

DepthPilot generates physically consistent and clinically interpretable colonoscopy videos by injecting depth priors into diffusion models through parameter-efficient fine-tuning and replacing linear denoising weights with adaptive splines.

ROPA: Synthetic Robot Pose Generation for RGB-D Bimanual Data Augmentation

cs.RO · 2025-09-23 · unverdicted · novelty 5.0

ROPA augments bimanual imitation learning datasets by generating synthetic RGB-D observations and actions via fine-tuned diffusion models with physical consistency constraints.

Open-Sora Plan: Open-Source Large Video Generation Model

cs.CV · 2024-11-28 · unverdicted · novelty 4.0

Open-Sora Plan presents an open-source large video generation model that combines a Wavelet-Flow VAE, Joint Image-Video Skiparse Denoiser, and multi-dimensional data curation to achieve high-quality video outputs with public code and weights.

citing papers explorer


One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer

cs.CV · 2025-11-28 · unverdicted · none · ref 34

Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

cs.CV · 2025-09-27 · unverdicted · none · ref 11

PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models

cs.CV · 2025-05-28 · unverdicted · none · ref 58

VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification

cs.CV · 2025-12-10 · unverdicted · none · ref 58

ROPA: Synthetic Robot Pose Generation for RGB-D Bimanual Data Augmentation

cs.RO · 2025-09-23 · unverdicted · none · ref 68
