TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 3 · method: 1

citation-polarity summary

years

2026: 4 · 2025: 2

representative citing papers

Mind DeepResearch Technical Report

cs.AI · 2026-04-16 · unverdicted · novelty 5.0

MindDR combines a Planning Agent, DeepSearch Agent, and Report Agent with SFT cold-start, Search-RL, Report-RL, and preference alignment to reach competitive scores on research benchmarks using 30B-scale models.
