Koala-36M: A large-scale video dataset improving consistency between fine-grained conditions and video content

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

fields: cs.CV 2, cs.LG 1

years: 2026 2, 2025 1

verdicts: UNVERDICTED 3

roles: dataset 1

polarities: use dataset 1

representative citing papers

Streaming Video Instruction Tuning

cs.CV · 2025-12-24 · unverdicted · novelty 6.0

Streamo is a streaming video LLM trained end-to-end on the new Streamo-Instruct-465K dataset that unifies multiple real-time video tasks with claimed strong temporal reasoning and generalization.

VDCook: DIY video data cook your MLLMs

cs.LG · 2026-03-04 · unverdicted · novelty 5.0

VDCook is an automated, self-evolving platform for generating in-domain video datasets for MLLMs via natural language queries, retrieval-synthesis, and multi-dimensional metadata.

citing papers explorer

Showing 3 of 3 citing papers.

  • SyncDPO: Enhancing Temporal Synchronization in Video-Audio Joint Generation via Preference Learning · cs.CV · 2026-05-12 · unverdicted · none · ref 44

    SyncDPO improves temporal synchronization in video-audio joint generation using DPO with efficient on-the-fly negative sample construction and curriculum learning.

  • Streaming Video Instruction Tuning · cs.CV · 2025-12-24 · unverdicted · none · ref 33

    Streamo is a streaming video LLM trained end-to-end on the new Streamo-Instruct-465K dataset that unifies multiple real-time video tasks with claimed strong temporal reasoning and generalization.

  • VDCook: DIY video data cook your MLLMs · cs.LG · 2026-03-04 · unverdicted · none · ref 22

    VDCook is an automated, self-evolving platform for generating in-domain video datasets for MLLMs via natural language queries, retrieval-synthesis, and multi-dimensional metadata.