Forty-first International Conference on Machine Learning , year=

MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation , author=

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.

ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution

cs.CL · 2025-09-17 · unverdicted · novelty 6.0

ShinkaEvolve improves sample efficiency in LLM-driven program evolution via parent sampling, code novelty rejection-sampling, and bandit LLM ensemble selection, achieving new SOTA circle packing with 150 samples and gains on math reasoning and competitive programming tasks.

citing papers explorer

Showing 2 of 2 citing papers.

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces cs.AI · 2026-05-09 · unverdicted · none · ref 36
OPT-BENCH and OPT-Agent evaluate LLM self-optimization in large search spaces, showing stronger models improve via feedback but stay constrained by base capacity and below human performance.
ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution cs.CL · 2025-09-17 · unverdicted · none · ref 81
ShinkaEvolve improves sample efficiency in LLM-driven program evolution via parent sampling, code novelty rejection-sampling, and bandit LLM ensemble selection, achieving new SOTA circle packing with 150 samples and gains on math reasoning and competitive programming tasks.

Forty-first International Conference on Machine Learning , year=

fields

years

verdicts

representative citing papers

citing papers explorer