pith. sign in

Miradata: A large-scale video dataset with long durations and structured captions.Advances in Neural Information Processing Systems, 37:48955–48970

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.CV 1 cs.LG 1

years

2026 2

verdicts

UNVERDICTED 2

representative citing papers

VDCook:DIY video data cook your MLLMs

cs.LG · 2026-03-04 · unverdicted · novelty 5.0

VDCook is an automated, self-evolving platform for generating in-domain video datasets for MLLMs via natural language queries, retrieval-synthesis, and multi-dimensional metadata.

citing papers explorer

Showing 2 of 2 citing papers.

  • SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer cs.CV · 2026-05-14 · unverdicted · none · ref 88

    SANA-WM is a 2.6B-parameter efficient world model that synthesizes minute-scale 720p videos with 6-DoF camera control, trained on 213K public clips in 15 days on 64 H100s and runnable on single GPUs at 36x higher throughput than prior open baselines.

  • VDCook:DIY video data cook your MLLMs cs.LG · 2026-03-04 · unverdicted · none · ref 10

    VDCook is an automated, self-evolving platform for generating in-domain video datasets for MLLMs via natural language queries, retrieval-synthesis, and multi-dimensional metadata.