Must allow pets

often feature simplified instruction-following setups where agents interact autonomously with complete information upfront, lacking realistic human-in-the-loop interaction · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

COMPASS: Benchmarking Constrained Optimization in LLM Agents

cs.LG · 2025-10-08 · unverdicted · novelty 7.0

COMPASS benchmark shows LLM agents reach 70-90% feasibility but only 20-60% optimality on constrained travel planning tasks, attributing the gap to insufficient search space exploration rather than tool use.

citing papers explorer

Showing 1 of 1 citing paper.

COMPASS: Benchmarking Constrained Optimization in LLM Agents cs.LG · 2025-10-08 · unverdicted · none · ref 5
