hub Canonical reference

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Canonical reference. 88% of citing Pith papers cite this work as background.

11 Pith papers citing it
Background 88% of classified citations

fields

cs.CV 9 cs.CL 2

years

2025 1 2024 10

roles

background 8

polarities

background 7 unclear 1

representative citing papers

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

cs.CV · 2024-08-03 · conditional · novelty 5.0

MiniCPM-Llama3-V 2.5 delivers GPT-4V-level multimodal performance on phones through architecture, pretraining, and alignment optimizations.

PaliGemma: A versatile 3B VLM for transfer

cs.CV · 2024-07-10 · unverdicted · novelty 4.0

PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.
