Can AI freelancers compete? Benchmarking earnings, reliability, and task success at scale. arXiv preprint arXiv:2505.13511, 2025

David Noever, Forrest McKee · 2025 · arXiv 2505.13511

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Flaws in the LLM Automation Narrative

stat.OT · 2026-06-09 · unverdicted · novelty 7.0

A new code-writing data analysis benchmark shows human experts outperforming a frontier LLM on average with lower performance variance.

citing papers explorer

Flaws in the LLM Automation Narrative · stat.OT · 2026-06-09 · unverdicted · none · ref 42