Blade: Benchmarking language model agents for data-driven science

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset: 2 · background: 1

citation-polarity summary

years

2026: 2 · 2025: 2

representative citing papers

Neurodata Without Boredom: Benchmarking Agentic AI for Data Reuse

cs.LG · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

AI agents handle individual data-loading and reformatting steps on neuroscience datasets but rarely complete fully error-free end-to-end pipelines, and AI judges are unreliable without ground-truth references.
