Trinity-RFT: A general-purpose and unified framework for reinforcement fine-tuning of large language models. arXiv preprint arXiv:2505.17826

Pan, X · 2025 · arXiv 2505.17826

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents

cs.LG · 2026-04-27 · unverdicted · novelty 7.0

TCOD stabilizes on-policy distillation for multi-turn agents via temporal curriculum on trajectory depth, improving performance up to 18 points over vanilla OPD and sometimes surpassing the teacher.

Boosting Reinforcement Learning with Verifiable Rewards via Randomly Selected Few-Shot Guidance

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

FEST improves RLVR sample efficiency on math and coding benchmarks by combining supervised signals, on-policy signals, and decaying weights on just 128 randomly chosen demonstrations, matching full-dataset baselines.

citing papers explorer

Showing 2 of 2 citing papers.

TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents cs.LG · 2026-04-27 · unverdicted · none · ref 10
Boosting Reinforcement Learning with Verifiable Rewards via Randomly Selected Few-Shot Guidance cs.LG · 2026-05-14 · unverdicted · none · ref 65
