hub Baseline reference

IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu · 2025 · arXiv 2025.363389

Baseline reference. 50% of citing Pith papers use this work as a benchmark or comparison.

10 Pith papers citing it

Baseline 50% of classified citations

read on arXiv browse 10 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 3 dataset 3

citation-polarity summary

background 3 use dataset 3

representative citing papers

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.

Efficient Video Diffusion Models: Advancements and Challenges

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.

VISTA: Triplet-Supervised Video Style Transfer with Diffusion Transformers

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

VISTA introduces a new synthetic triplet dataset and diffusion-transformer framework with style adapter that jointly models style, content, and motion to achieve state-of-the-art video style transfer.

Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Head Forcing assigns tailored KV cache strategies to local, anchor, and memory attention heads plus head-wise RoPE re-encoding to extend autoregressive video generation from seconds to minutes without training.

SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation

cs.CV · 2026-05-07 · unverdicted · novelty 6.0

SwiftI2V achieves comparable 2K I2V quality to end-to-end models on VBench-I2V while cutting GPU time by 202x through low-resolution motion planning followed by strongly image-conditioned segment-wise high-resolution synthesis.

Investigating Ethical Data Communication with Purrsuasion: An Educational Game about Negotiated Data Disclosure

cs.HC · 2026-04-06 · unverdicted · novelty 6.0

Purrsuasion is a negotiation game that surfaces satisficing and intent-attribution difficulties when students practice ethical data disclosure under real constraints.

Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

Salt improves low-step video generation quality by adding endpoint-consistent regularization to distribution matching distillation and using cache-conditioned feature alignment for autoregressive models.

A Spectral Framework for Multi-Scale Nonlinear Dimensionality Reduction

cs.LG · 2026-04-02 · unverdicted · novelty 6.0

A spectral framework for nonlinear DR uses spectral bases plus cross-entropy optimization to create multi-scale embeddings that preserve both global manifold geometry and local neighborhoods while supporting graph-frequency analysis.

Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion

cs.CV · 2026-02-08 · unverdicted · novelty 6.0

Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.

Focused Forcing: Content-Aware Per-Frame KV Selection for Efficient Autoregressive Video Diffusion

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Focused Forcing is a training-free per-frame KV selection method that combines attention scores with diversity metrics and head-importance estimation to accelerate autoregressive video diffusion up to 1.48x while improving quality.

citing papers explorer

Showing 10 of 10 citing papers.

CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing cs.CV · 2026-05-19 · unverdicted · none · ref 18
CutVerse benchmark evaluates GUI agents on 186 complex media post-production tasks in seven apps and reports 36% success rate for existing models.
Efficient Video Diffusion Models: Advancements and Challenges cs.CV · 2026-04-17 · unverdicted · none · ref 58
A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.
VISTA: Triplet-Supervised Video Style Transfer with Diffusion Transformers cs.CV · 2026-05-17 · unverdicted · none · ref 17
VISTA introduces a new synthetic triplet dataset and diffusion-transformer framework with style adapter that jointly models style, content, and motion to achieve state-of-the-art video style transfer.
Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity cs.CV · 2026-05-14 · unverdicted · none · ref 27
Head Forcing assigns tailored KV cache strategies to local, anchor, and memory attention heads plus head-wise RoPE re-encoding to extend autoregressive video generation from seconds to minutes without training.
SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation cs.CV · 2026-05-07 · unverdicted · none · ref 8
SwiftI2V achieves comparable 2K I2V quality to end-to-end models on VBench-I2V while cutting GPU time by 202x through low-resolution motion planning followed by strongly image-conditioned segment-wise high-resolution synthesis.
Investigating Ethical Data Communication with Purrsuasion: An Educational Game about Negotiated Data Disclosure cs.HC · 2026-04-06 · unverdicted · none · ref 13
Purrsuasion is a negotiation game that surfaces satisficing and intent-attribution difficulties when students practice ethical data disclosure under real constraints.
Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation cs.CV · 2026-04-03 · unverdicted · none · ref 18
Salt improves low-step video generation quality by adding endpoint-consistent regularization to distribution matching distillation and using cache-conditioned feature alignment for autoregressive models.
A Spectral Framework for Multi-Scale Nonlinear Dimensionality Reduction cs.LG · 2026-04-02 · unverdicted · none · ref 41
A spectral framework for nonlinear DR uses spectral bases plus cross-entropy optimization to create multi-scale embeddings that preserve both global manifold geometry and local neighborhoods while supporting graph-frequency analysis.
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion cs.CV · 2026-02-08 · unverdicted · none · ref 41
Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.
Focused Forcing: Content-Aware Per-Frame KV Selection for Efficient Autoregressive Video Diffusion cs.CV · 2026-05-18 · unverdicted · none · ref 22
Focused Forcing is a training-free per-frame KV selection method that combines attention scores with diversity metrics and head-importance estimation to accelerate autoregressive video diffusion up to 1.48x while improving quality.

IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer