OutSafe-Bench supplies the first large-scale four-modality safety dataset and evaluation framework that exposes persistent unsafe outputs in nine leading multimodal LLMs.
This subset covers balanced distributions across nine risk categories and four modalities (text, image, audio and video), with detailed data shown in the table 4
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models
OutSafe-Bench supplies the first large-scale four-modality safety dataset and evaluation framework that exposes persistent unsafe outputs in nine leading multimodal LLMs.