Technical report, John F

Alex Havrilla, Maia Iyer · 2024 · arXiv 2402.04004

3 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening

cs.CV · 2026-05-17 · unverdicted · novelty 5.0

SafeLens presents a fast-and-slow video guardrail framework that filters the SafeWatch dataset to 2.4% and adds Chain-of-Thought traces to achieve state-of-the-art moderation performance at reduced inference cost.

Multilinguality at the Edge: Developing Language Models for the Global South

cs.CL · 2026-04-23 · unverdicted · novelty 5.0

A survey of 232 papers on the intersection of multilingual language modeling and edge deployment identifies the 'last mile' challenge for Global South communities and offers recommendations for more inclusive NLP.

DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training

cs.LG · 2025-12-03 · unverdicted · novelty 5.0

DVPO learns token-level value distributions and uses asymmetric risk regularization to contract lower tails while expanding upper tails, outperforming PPO and GRPO under noisy supervision in dialogue, math, and QA tasks.

citing papers explorer

Showing 3 of 3 citing papers.

SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening
cs.CV · 2026-05-17 · unverdicted · none · ref 44
Multilinguality at the Edge: Developing Language Models for the Global South
cs.CL · 2026-04-23 · unverdicted · none · ref 7
DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training
cs.LG · 2025-12-03 · unverdicted · none · ref 10
