Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

· 2026 · cs.AI · arXiv 2604.15709

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open full Pith review browse 1 citing papers arXiv PDF

abstract

Agent \texttt{skills} are structured collections of instructions, tools, and supporting resources that help large language model (LLM) agents perform particular classes of tasks. Empirical evidence shows that the design of \texttt{skills} can materially affect agent task performance, yet systematically optimizing \texttt{skills} remains challenging. Since a \texttt{skill} comprises instructions, tools, and supporting resources in a structured way, optimizing it requires jointly determining both the structure of these components and the content each component contains. This gives rise to a complex decision space with strong interdependence across structure and components. We therefore represent these two coupled decisions as \texttt{skill} structure and component content, and formulate \texttt{skill} optimization as a bilevel optimization problem. We propose a bilevel optimization framework in which an outer loop employs Monte Carlo Tree Search to determine the \texttt{skill} structure, while an inner loop refines the component content within the structure selected by the outer loop. In both loops, we employ LLMs to assist the optimization procedure. We evaluate the proposed framework on an open-source Operations Research Question Answering dataset, and the experimental results suggest that the bilevel optimization framework improves the performance of the agents with the optimized \texttt{skill}.

representative citing papers

Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems

cs.AI · 2026-04-30 · unverdicted · novelty 6.0

SBD is a bilevel optimization framework that learns context-dependent safety weights for runtime task delegation in hierarchical multi-agent systems, with continuous authority transfer alpha and theoretical guarantees on safety monotonicity, policy convergence, and accountability propagation.

citing papers explorer

Showing 1 of 1 citing paper.

Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems cs.AI · 2026-04-30 · unverdicted · none · ref 6 · internal anchor
SBD is a bilevel optimization framework that learns context-dependent safety weights for runtime task delegation in hierarchical multi-agent systems, with continuous authority transfer alpha and theoretical guarantees on safety monotonicity, policy convergence, and accountability propagation.

Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

fields

years

verdicts

representative citing papers

citing papers explorer