The Role of ImageNet Classes in Fréchet Inception Distance
3 Pith papers cite this work. Polarity classification is still indexing.
years
2025: 3
representative citing papers
BadRDM is a backdoor attack on retrieval-augmented diffusion models that poisons the retrieval database with toxicity surrogates and uses multimodal contrastive learning to force toxic generations from text triggers while preserving benign performance.
LLM safety evaluations are hindered by noise in dataset curation, automated red-teaming, response generation, and LLM-judge evaluation, making fair comparisons difficult and slowing progress.
citing papers explorer
- LAION-C: An Out-of-Distribution Benchmark for Web-Scale Vision Models
  LAION-C supplies six novel corruptions that stay OOD for web-scale training sets and demonstrates that leading models now rival or exceed human robustness on them.
- Retrievals Can Be Detrimental: Unveiling the Backdoor Vulnerability of Retrieval-Augmented Diffusion Models
  BadRDM is a backdoor attack on retrieval-augmented diffusion models that poisons the retrieval database with toxicity surrogates and uses multimodal contrastive learning to force toxic generations from text triggers while preserving benign performance.
- LLM-Safety Evaluations Lack Robustness
  LLM safety evaluations are hindered by noise in dataset curation, automated red-teaming, response generation, and LLM-judge evaluation, making fair comparisons difficult and slowing progress.