EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
- SEAL: Synergistic Co-Evolution of Agents and Learning Environments
  SEAL co-evolves LLM agents and environments via shared turn-level failure diagnoses, yielding +8.25 to +26.25 point gains on tool-use tasks with only 400 samples.
- EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
  EnvFactory automates synthesis of executable tool environments and natural multi-turn trajectories from authentic sources to enable efficient Agentic RL, delivering up to +15% gains on BFCLv3 with only 85 environments.
- Hybrid Inspection and Task-Based Access Control in Zero-Trust Agentic AI
  A hybrid deterministic-plus-semantic interception layer for continuous task-based authorization of multi-turn LLM agent tool invocations, with new multi-turn datasets and initial experiments.