BRAVE: Broadening the visual encoding of vision-language models

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Beyond Encoder Accumulation: Measuring Encoder Roles in Multi-Encoder VLMs

cs.CV · 2026-06-02 · unverdicted · novelty 6.0

Retraining all 31 subsets of five vision encoders shows Capacity and Necessity are distinct, pre-projector effective rank predicts residual performance at fixed parameter count, and high-Capacity plus adaptive complement pairs match the full five-encoder model.

PaliGemma 2: A Family of Versatile VLMs for Transfer

cs.CV · 2024-12-04 · unverdicted · novelty 4.0

PaliGemma 2 is a family of vision-language models that achieves state-of-the-art results on transfer tasks like table structure recognition and radiography report generation by combining SigLIP with Gemma 2 models at various sizes and resolutions.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Beyond Encoder Accumulation: Measuring Encoder Roles in Multi-Encoder VLMs · cs.CV · 2026-06-02 · unverdicted · none · ref 12

  • PaliGemma 2: A Family of Versatile VLMs for Transfer · cs.CV · 2024-12-04 · unverdicted · none · ref 34
