Gary Chan

Jierun Chen, Fangyun Wei, Jinjing Zhao, Sizhe Song, Bohuai Wu, Zhuoxuan Peng, S-H Gary Chan, Hongyang Zhang · 2024 · arXiv 2406.16866

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

cs.CV · 2025-01-07 · conditional · novelty 6.0

Sa2VA unifies SAM-2 segmentation with MLLM reasoning into a single model for referring segmentation and conversation on images and videos, supported by a new 72k-expression Ref-SAV dataset.

LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

cs.CV · 2025-09-28 · unverdicted · novelty 5.0

LLaVA-OneVision-1.5 provides open datasets, code, and models that match or exceed closed competitors on 27 benchmarks at low cost through curated data and efficient training.

citing papers explorer

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

cs.CV · 2025-01-07 · conditional · none · ref 9

Sa2VA unifies SAM-2 segmentation with MLLM reasoning into a single model for referring segmentation and conversation on images and videos, supported by a new 72k-expression Ref-SAV dataset.

LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

cs.CV · 2025-09-28 · unverdicted · none · ref 2

LLaVA-OneVision-1.5 provides open datasets, code, and models that match or exceed closed competitors on 27 benchmarks at low cost through curated data and efficient training.
