
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background: 2 · dataset: 1


fields

cs.CV: 3

years

2026: 1 · 2025: 2

representative citing papers

Video-R1: Reinforcing Video Reasoning in MLLMs

cs.CV · 2025-03-27 · conditional · novelty 7.0

Video-R1 uses temporal-aware RL and mixed datasets to boost video reasoning in MLLMs, with a 7B model reaching 37.1% on VSI-Bench and surpassing GPT-4o.
