Clipscore: A reference-free evaluation metric for image captioning

Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, Yejin Choi · 2022

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

citation-role summary

method 1 other 1

citation-polarity summary

unclear 1 use method 1

representative citing papers

Mind the Gap: Geometrically Accurate Generative Reconstruction from Disjoint Views

cs.CV · 2026-05-08 · unverdicted · novelty 8.0

GLADOS reconstructs 3D geometry from disjoint views by generating intermediate perspectives, performing robust coarse alignment that tolerates generative inconsistencies, and iteratively expanding context for consistency.

Knowledge Visualization: A Benchmark and Method for Knowledge-Intensive Text-to-Image Generation

cs.CV · 2026-04-24 · conditional · novelty 7.0

KVBench reveals major gaps in current T2I models for knowledge-intensive tasks, and KE-Check narrows the gap between open- and closed-source models by adding structured knowledge and enforcing constraints.

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

Auto-Rubric as Reward externalizes VLM preferences into structured rubrics and applies Rubric Policy Optimization to create more reliable binary rewards for multimodal generation, outperforming pairwise models on text-to-image and editing benchmarks.

Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation

cs.CV · 2024-02-27 · unverdicted · novelty 4.0

Optimizing the noise schedule, preparing a balanced bucketed dataset, and aligning outputs with human preferences enables Playground v2.5 to reach state-of-the-art aesthetic quality across aspect ratios.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Mind the Gap: Geometrically Accurate Generative Reconstruction from Disjoint Views cs.CV · 2026-05-08 · unverdicted · none · ref 42
GLADOS reconstructs 3D geometry from disjoint views by generating intermediate perspectives, performing robust coarse alignment that tolerates generative inconsistencies, and iteratively expanding context for consistency.

Clipscore: A reference-free evaluation metric for image captioning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer