arXiv preprint arXiv:2508.03296
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- ARGUS: Policy-Adaptive Ad Governance via Evolving Reinforcement with Adversarial Umpiring
ARGUS uses a Prosecutor-Defender-Umpire multi-agent setup plus RAG and chain-of-thought rewards to adapt ad policy enforcement to new regulations using minimal fresh labels.
- UniNote: A Unified Embedding Model for Multimodal Representation and Ranking
UniNote proposes a two-stage trained unified embedding model (contrastive SFT then RL) for multimodal I2I retrieval that claims SOTA results and was deployed at Xiaohongshu with MRL for improved quality and efficiency.