TaskBench: Benchmarking large language models for task automation
5 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (5)
roles: background (1)
polarities: background (1)
citing papers explorer
- ArtifactLinker: Linking Scientific Artifacts for Automatic State-of-the-Art Discovery
  ArtifactLinker frames SOTA discovery as missing-link prediction on an artifact graph of models and datasets, with a two-stage ranking-plus-verification pipeline and a new benchmark of 14k artifacts.
- The Scaling Laws of Skills in LLM Agent Systems
  Empirical analysis across 15 LLMs and 1,141 skills identifies a logarithmic routing decay law and a multiplicative execution law coupled by a single fitted slope parameter b that enables targeted library optimizations improving routing accuracy and downstream task pass rates.
- ToolMATH: A Diagnostic Benchmark for Long-Horizon Tool Use under Systematic Tool-Catalog Constraints
  ToolMATH converts MATH solutions into controlled tool environments with gold tools and graded distractors to diagnose LLM adaptability, robustness, and long-horizon tool connectivity.
- From Intent to Execution: Composing Agentic Workflows with Agent Recommendation
  A framework automates multi-agent system creation via LLM planning and two-stage agent recommendation, claiming higher recall than prior methods.
- A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications