Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications. arXiv preprint arXiv:2108.02818

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: method 1

citation-polarity summary

fields: cs.CV 3, cs.AI 1

years: 2026 3, 2021 1

roles: method 1

polarities: use method 1

representative citing papers

An Attribute-Based Measure of Video Complexity

cs.CV · 2026-05-30 · unverdicted · novelty 7.0

VideoABC estimates video-LLM failure probability via low-dimensional attribute projection, dual quantization (k-means plus lattice), and psychophysics-inspired synthetic data.

Bias at the End of the Score

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

Reward models used as quality scorers in text-to-image generation encode demographic biases that cause reward-guided training to sexualize female subjects, reinforce stereotypes, and reduce diversity.

citing papers explorer

Showing 4 of 4 citing papers.

  • CLIPScore: A Reference-free Evaluation Metric for Image Captioning cs.CV · 2021-04-18 · conditional · none · ref 2

    CLIPScore uses a web-pretrained CLIP model to evaluate image captions without references and achieves higher human correlation than CIDEr or SPICE.

  • An Attribute-Based Measure of Video Complexity cs.CV · 2026-05-30 · unverdicted · none · ref 2

    VideoABC estimates video-LLM failure probability via low-dimensional attribute projection, dual quantization (k-means plus lattice), and psychophysics-inspired synthetic data.

  • Bias at the End of the Score cs.CV · 2026-04-14 · unverdicted · none · ref 1

    Reward models used as quality scorers in text-to-image generation encode demographic biases that cause reward-guided training to sexualize female subjects, reinforce stereotypes, and reduce diversity.

  • ComMem: Complementary Memory Systems for Test-Time Adaptation of Vision-Language Models cs.AI · 2026-06-27 · unverdicted · none · ref 2

    ComMem proposes complementary fast visual cache and slow textual prototype memories for test-time adaptation of VLMs, claiming superior performance on 15 benchmarks under distribution shifts.