GuardReasoner-Omni: A Reasoning-based Multi-modal Guardrail for Text, Image, Video, and Audio
read the original abstract
We present GuardReasoner-Omni, a reasoning-based guardrail model designed to moderate text, image, video, and audio data. First, we construct a comprehensive training corpus comprising 181k samples spanning these four modalities. Our training pipeline follows a two-stage paradigm to incentivize the model to deliberate before making decisions: (1) conducting SFT to cold-start the model with explicit reasoning capabilities and structural adherence; and (2) performing RL with a concise correctness reward to preserve accurate reasoning while suppressing redundant generation. We release a suite of models scaled at 3B and 7B parameters. Extensive experiments demonstrate that GuardReasoner-Omni achieves superior performance compared to existing state-of-the-art baselines across various guardrail benchmarks.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening
SafeLens presents a fast-and-slow video guardrail framework that filters the SafeWatch dataset to 2.4% and adds Chain-of-Thought traces to achieve state-of-the-art moderation performance at reduced inference cost.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.