HybridFlow: A Flexible and Efficient RLHF Framework

Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, Chuan Wu

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Leveraging Latent Visual Reasoning in Silence

cs.CV · 2026-05-18 · conditional · novelty 6.0

Latent visual reasoning improves multimodal models via training effects even without using latent tokens at inference, enabled by an attention-based RL reward that promotes interaction with text tokens.

VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

VerifyMAS improves failure attribution in LLM multi-agent systems via hypothesis verification on full trajectories, error taxonomy-based data construction, and fine-tuned verifier models, outperforming prior direct-prediction methods on Aegis-Bench and Who&When.

citing papers explorer

Leveraging Latent Visual Reasoning in Silence · cs.CV · 2026-05-18 · conditional · none · ref 25
VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems · cs.CL · 2026-05-17 · unverdicted · none · ref 22
