TIME trains LLMs to trigger compact, context-triggered reasoning via time tags and tick events, improving TIMEBench scores while cutting explicit reasoning tokens by an order of magnitude.
DOTS: Learning to reason dynamically in LLMs via optimal reasoning trajectories search
2 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
End-to-end RL in authentic web environments produces LLM research agents that outperform prompt-engineering and RAG-based baselines by up to 28.9 and 7.2 points respectively while exhibiting emergent planning, cross-validation, and self-reflection.
citing papers explorer
-
TIME: Temporally Intelligent Meta-reasoning Engine for Context-Triggered Explicit Reasoning
TIME trains LLMs to trigger compact, context-triggered reasoning via time tags and tick events, improving TIMEBench scores while cutting explicit reasoning tokens by an order of magnitude.
-
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
End-to-end RL in authentic web environments produces LLM research agents that outperform prompt-engineering and RAG-based baselines by up to 28.9 and 7.2 points respectively while exhibiting emergent planning, cross-validation, and self-reflection.