GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

years: 2026 (2), 2024 (3)

roles

background 1

polarities

background 1

representative citing papers

Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

Masked Logit Nudging aligns visual autoregressive model logits with source token maps under target prompts inside cross-attention masks, delivering top image editing results on PIE benchmarks and strong reconstructions on COCO and OpenImages while running faster than diffusion approaches.

GPT-4V(ision) is a Generalist Web Agent, if Grounded

cs.IR · 2024-01-03 · conditional · novelty 6.0

GPT-4V achieves 51.1% success on live web tasks as a generalist agent when plans are manually grounded, outperforming text-only models, but automatic grounding lags far behind oracle performance.
