Advances in neural information pro- cessing systems35, 16344–16359 (2022)

Dao, T · 2022

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

browse 6 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

DiSI disentangles stochastic interpolants into separate generation and regression paths, allowing controllable transitions between regression and generative image restoration with a unified few-step sampler.

Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs

cs.LG · 2026-04-22 · unverdicted · novelty 7.0

Sink-Token-aware Pruning (SToP) suppresses semantically uninformative sink tokens during visual token pruning in Video LLMs, boosting fine-grained performance even at 90% pruning rates across hallucination, reasoning, and MCQA benchmarks.

StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

StreamGVE enables high-quality training-free video editing by converting the task to noise-to-data streaming generation with dual-branch fast sampling, self-attention bridges, cross-attention grounding, source-oriented guidance, and visual prompting.

Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Head Forcing assigns tailored KV cache strategies to local, anchor, and memory attention heads plus head-wise RoPE re-encoding to extend autoregressive video generation from seconds to minutes without training.

OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

OneDrive unifies heterogeneous decoding in a single VLM transformer decoder for end-to-end driving, achieving 0.28 L2 error and 0.18 collision rate on nuScenes plus 86.8 PDMS on NAVSIM.

Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation

cs.CV · 2026-04-11 · conditional · novelty 6.0

Hybrid Forcing combines linear temporal attention for long-range retention, block-sparse attention for efficiency, and decoupled distillation to achieve real-time unbounded 832x480 streaming video generation at 29.5 FPS.

citing papers explorer

Showing 6 of 6 citing papers.

Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration cs.CV · 2026-05-20 · unverdicted · none · ref 15
DiSI disentangles stochastic interpolants into separate generation and regression paths, allowing controllable transitions between regression and generative image restoration with a unified few-step sampler.
Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs cs.LG · 2026-04-22 · unverdicted · none · ref 6
Sink-Token-aware Pruning (SToP) suppresses semantically uninformative sink tokens during visual token pruning in Video LLMs, boosting fine-grained performance even at 90% pruning rates across hallucination, reasoning, and MCQA benchmarks.
StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation cs.CV · 2026-05-20 · unverdicted · none · ref 13
StreamGVE enables high-quality training-free video editing by converting the task to noise-to-data streaming generation with dual-branch fast sampling, self-attention bridges, cross-attention grounding, source-oriented guidance, and visual prompting.
Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity cs.CV · 2026-05-14 · unverdicted · none · ref 12
Head Forcing assigns tailored KV cache strategies to local, anchor, and memory attention heads plus head-wise RoPE re-encoding to extend autoregressive video generation from seconds to minutes without training.
OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models cs.CV · 2026-04-20 · unverdicted · none · ref 14
OneDrive unifies heterogeneous decoding in a single VLM transformer decoder for end-to-end driving, achieving 0.28 L2 error and 0.18 collision rate on nuScenes plus 86.8 PDMS on NAVSIM.
Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation cs.CV · 2026-04-11 · conditional · none · ref 8
Hybrid Forcing combines linear temporal attention for long-range retention, block-sparse attention for efficiency, and decoupled distillation to achieve real-time unbounded 832x480 streaming video generation at 29.5 FPS.

Advances in neural information pro- cessing systems35, 16344–16359 (2022)

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer