HybridFlow: A Flexible and Efficient RLHF Framework
2 Pith papers cite this work.
years: 2026 (2)
representative citing papers
VerifyMAS improves failure attribution in LLM multi-agent systems via hypothesis verification on full trajectories, error taxonomy-based data construction, and fine-tuned verifier models, outperforming prior direct-prediction methods on Aegis-Bench and Who&When.
citing papers explorer
- Leveraging Latent Visual Reasoning in Silence
Latent visual reasoning improves multimodal models via training effects even without using latent tokens at inference, enabled by an attention-based RL reward that promotes interaction with text tokens.
- VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems
VerifyMAS improves failure attribution in LLM multi-agent systems via hypothesis verification on full trajectories, error taxonomy-based data construction, and fine-tuned verifier models, outperforming prior direct-prediction methods on Aegis-Bench and Who&When.