UNIVID generates policy-aware captions for video moderation, reducing violation leakage by 42.7% and overkill rate by 37.0% while replacing over 1,000 policy-specific models with a single backbone.
arXiv preprint arXiv:2505.23386
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.MM (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- UNIVID: Unified Vision-Language Model for Video Moderation