CLIPScore uses a web-pretrained CLIP model to evaluate image captions without references and achieves higher human correlation than CIDEr or SPICE.
Evaluating clip: towards characterization of broader capabilities and downstream implications.arXiv preprint arXiv:2108.02818
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
method 1
citation-polarity summary
fields
cs.CV 2roles
method 1polarities
use method 1representative citing papers
Reward models used as quality scorers in text-to-image generation encode demographic biases that cause reward-guided training to sexualize female subjects, reinforce stereotypes, and reduce diversity.
citing papers explorer
-
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
CLIPScore uses a web-pretrained CLIP model to evaluate image captions without references and achieves higher human correlation than CIDEr or SPICE.
-
Bias at the End of the Score
Reward models used as quality scorers in text-to-image generation encode demographic biases that cause reward-guided training to sexualize female subjects, reinforce stereotypes, and reduce diversity.