URL https://www.nature.com/articles/s41586-024-07930-y

2024 · DOI 10.1038/s41586-024-07930-y

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

What Would GPT Click: Practical Effects of Human-AI Behavioral Misalignment and the Cost of Synthetic Participants in User Experience

cs.HC · 2026-05-18 · unverdicted · novelty 5.0

GPT produces click distributions that differ significantly from those of real humans in 53% of UX first-click tasks, and prompting techniques such as personas and chain-of-thought fail to improve alignment.

Predicting Performance of Symbolic and Prompt Programs with Examples

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

Proposes RAP, a retrieval-based approximate prior method, to predict performance of symbolic programs and LLM prompts on new tasks using a Bernoulli model and corpus-derived performance distributions.

A Reproducible Optimisation Protocol for Calibrating Prompt-Based Large Language Model Workflows in Evidence Synthesis

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

The paper introduces a reproducible optimization protocol for prompt-based LLM workflows in evidence synthesis that separates task definitions from prompt harnesses, optimizes the harness against metrics and examples, and preserves the result as an inspectable artefact.

citing papers explorer

Showing 3 of 3 citing papers.

What Would GPT Click: Practical Effects of Human-AI Behavioral Misalignment and the Cost of Synthetic Participants in User Experience
cs.HC · 2026-05-18 · unverdicted · none · ref 31
Predicting Performance of Symbolic and Prompt Programs with Examples
cs.LG · 2026-05-15 · unverdicted · none · ref 11
A Reproducible Optimisation Protocol for Calibrating Prompt-Based Large Language Model Workflows in Evidence Synthesis
cs.LG · 2026-05-07 · unverdicted · none · ref 11
