Title resolution pending

Anthropic · 2026

5 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Collider-Bench: Benchmarking AI Agents with Particle Physics Analysis Reproduction

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Collider-Bench is a new benchmark showing that current LLM agents cannot reliably reproduce LHC analyses at the level of a physicist-in-the-loop.

One prompt is not enough: Instruction Sensitivity Undermines Embedding Model Evaluation

cs.CL · 2026-05-21 · accept · novelty 6.0

Single-prompt evaluations of instruction-tuned embedding models misrepresent performance and allow any model to be ranked first by favorable prompt choice.

How Far Are We From True Auto-Research?

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

ResearchArena shows that agent-generated papers fail top-tier acceptance standards primarily due to fabricated results, underpowered experiments, and plan-execution mismatches that vary sharply by agent.

Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

Three Metapath2Vec variants create ingredient embeddings by walking a co-occurrence graph from recipes, a typed chemical compound graph from FlavorDB, or a controlled blend of both.

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

cs.AI · 2026-04-20

citing papers explorer

Showing 3 of 3 citing papers after filters.

Collider-Bench: Benchmarking AI Agents with Particle Physics Analysis Reproduction

cs.LG · 2026-05-13 · unverdicted · none · ref 42

Collider-Bench is a new benchmark showing that current LLM agents cannot reliably reproduce LHC analyses at the level of a physicist-in-the-loop.

How Far Are We From True Auto-Research?

cs.AI · 2026-05-18 · unverdicted · none · ref 4

ResearchArena shows that agent-generated papers fail top-tier acceptance standards primarily due to fabricated results, underpowered experiments, and plan-execution mismatches that vary sharply by agent.

Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings

cs.AI · 2026-05-21 · unverdicted · none · ref 3

Three Metapath2Vec variants create ingredient embeddings by walking a co-occurrence graph from recipes, a typed chemical compound graph from FlavorDB, or a controlled blend of both.
