Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of Artificial Intelligence on Knowledge Worker Productivity and Quality

Dell’Acqua, F · 2025 · arXiv 2025.21838

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 1 · support 1

representative citing papers

AI co-mathematician: Accelerating mathematicians with agentic AI

cs.AI · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

An interactive AI workbench for mathematicians achieves 48% on FrontierMath Tier 4 and helped solve open problems in early tests.

A Technical Typology of AI Systems in Public Administration

cs.CY · 2026-06-30 · unverdicted · novelty 6.0

The paper defines five AI system categories for public administration and reports that 55% of 91 recent papers leave the system type underspecified while 31% study one type but motivate with another.

When Helpfulness Overrides Causal Caution: Context-Dependent Suppression and Recovery in LLMs

cs.AI · 2026-06-23 · unverdicted · novelty 6.0

LLMs suppress causal caution in practical advisory contexts (rates drop from 91.7-100% to 6.7-18.3%) but recover it with a self-correction prompt (to 71.4-100%).

BlueFin: Benchmarking LLM Agents on Financial Spreadsheets

cs.SE · 2026-05-29 · unverdicted · novelty 6.0

BlueFin is a new benchmark for LLM agents on financial spreadsheets showing frontier models score below 50% with weaknesses in dynamic correctness.

Queue & AI: When Faster Tasks Slow Down the Workflow

cs.CY · 2026-05-26 · unverdicted · novelty 6.0

A queueing model of AI task processing identifies a 'variance wedge' in which mean task time falls but system delay rises because of rework and reduced oversight under congestion.

The Open-Box Fallacy: Why AI Deployment Needs a Calibrated Verification Regime

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

AI deployment in high-stakes areas requires domain-scoped calibrated verification with monitoring and revocation, using a proposed six-component Verification Coverage standard instead of mechanistic interpretability.

Human Capital, AI, and Labor Commoditization

econ.GN · 2026-06-20 · unverdicted · novelty 5.0 · 2 refs

Difference-in-differences analysis around ChatGPT release shows commoditization of labor in AI-exposed job categories on Upwork, with declining human capital importance and rising price importance.

Position: AI as Part of Self -- Extending the Mind Requires Cognitive Co-Regulation

cs.HC · 2026-05-15 · unverdicted · novelty 5.0

The paper claims that alignment requires treating AI as part of the self through cognitive co-regulation, identifying risks like deskilling and automation bias while drawing on System 0 cognition theory.

Jagged AI in Scientific Peer Review: Evidence from POMP Data Analysis

stat.AP · 2026-05-08 · unverdicted · novelty 5.0 · 2 refs

AI peer reviewers for POMP analyses show jagged performance: strong on technical error detection and invalid inference but weak on interpretive errors, narrative coherence, and domain-informed critique.

From Exposure to Adoption: Generative AI in European Workplaces

econ.GN · 2026-04-20 · unverdicted · novelty 5.0

Generative AI adoption in Europe ranges from under 3% to 25%, is steeper for skilled workers in abstract-task jobs and in digitally advanced countries with training, shows a gender gap in exposed roles, and has produced no detectable shift in reported task content so far.

Hallucinations in Organization-backed AI advisors: Evidence about Skepticism, Verification, and Reliance in Goal-Directed Use

cs.HC · 2026-06-22 · unverdicted · novelty 4.0

Literature review synthesizing evidence on user skepticism, verification, and reliance with hallucinating AI advisors, noting that output-related cues like warnings show weak effects and that content category has not been experimentally varied.
