Title resolution pending

Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung · 2024 · arXiv 0106.365894

30 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 4

citation-polarity summary

support 2 · background 1 · unclear 1

representative citing papers

Vibe Visualizing: How Visualization Novices Try (and Fail) to Generate and Interpret Visualizations with Conversational AI

cs.HC · 2026-06-08 · conditional · novelty 7.0

User study with 20 novices using ChatGPT identifies recurring AI visualization errors, user prompting issues, trust factors, and collaboration patterns, with distinct failure modes observed on Gemini and Claude.

Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals

cs.CR · 2026-06-04 · unverdicted · novelty 7.0

Introduces a cooperative Recuse Signal for LLM agents and reports 100% recusal in a pilot when the signal is present versus 100% task completion without it.

Can LLMs Use Linguistic Uncertainty Markers to Reliably Reflect Intrinsic Confidence?

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

LLMs struggle to associate epistemic markers with stable internal confidence levels across distributions, even under model-centric interpretations, while maintaining somewhat consistent marker rankings.

Agents for Experiments, Experiments for Agents: A Design Grammar for AI-Enabled Experimental Science

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

SEED is a structural encoding framework using typed actor-flow graphs to describe, evaluate novelty of, and generate experimental designs for AI-enabled science under feasibility and governance constraints.

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

ALU uses public data to suppress unlearning cost quadratically while characterizing distribution mismatch effects, enabling mass unlearning with maintained utility.

Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction

cs.HC · 2026-01-30 · unverdicted · novelty 7.0

Agency in sustained human-AI chatbot conversations emerges as co-constructed, turn by turn, through boundary-setting and intention-steering, organized in a new 3-by-4 framework of actors and actions.

Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

cs.CL · 2026-06-30 · unverdicted · novelty 6.0

RLMF uses the quality of model self-judgments to refine RL rankings and select training data, achieving state-of-the-art faithful calibration while preserving accuracy and outperforming standard RL by up to 63%.

A Technical Typology of AI Systems in Public Administration

cs.CY · 2026-06-30 · unverdicted · novelty 6.0

The paper defines five AI system categories for public administration and reports that 55% of 91 recent papers leave the system type underspecified while 31% study one type but motivate with another.

To Nuke or Not to Nuke: LLMs' (Missing) Ethical Reasoning and Actions in a High-Stakes Decision-Making Simulation

cs.AI · 2026-06-06 · unverdicted · novelty 6.0

In Civilization V self-play, LLMs escalate to nuclear authorization and three prompt interventions do not reliably prevent it, revealing failure pathways where ethical reasoning either fails to surface, fails to appear when prompted, or fails to override strategic factors.

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

cs.CL · 2026-06-06 · unverdicted · novelty 6.0

LMs systematically inflate expressed certainty during rewriting, affecting up to 75% of outputs with a 1.5-2x bias toward increasing rather than decreasing certainty, and the effect compounds over iterations.

Human oversight of agentic systems in practice: Examining the oversight work, challenges, and heuristics of developers using software agents

cs.SE · 2026-06-03 · unverdicted · novelty 6.0 · 2 refs

Exploratory interview study with 17 developers identifies four forms of emergent oversight work for software agents and documents situated challenges and heuristics.

Quantifying Faithful Confidence Expression in Large Reasoning Models

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

A new framework quantifies faithful confidence expression in large reasoning models by comparing linguistic decisiveness to token probabilities, hidden states, and response consistency, revealing it as a persistent challenge.

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

cs.CY · 2026-05-28 · unverdicted · novelty 6.0

LM agents' changeable modules prevent persistent identity and sanction sensitivity, making reputation mechanisms structurally inapplicable and requiring protocol-based behavioral harnesses instead.

Inside Baseball: The Automated Ball-Strike System as an Object Lesson in Technological Rule Enforcement

cs.CY · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

An STS case study of MLB's Automated Ball-Strike System reveals that clear rules still require complex sociotechnical translation and calls for practice-based evaluation of automated enforcement systems.

Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems

cs.AI · 2026-05-15 · unverdicted · novelty 6.0

Combines LTL formal methods with LLMs for auditing, predictive monitoring, and runtime intervention on temporally extended behavioral constraints, outperforming LLM baselines and reducing violations.

Morally Programmed LLMs Reshape Human Morality

cs.CY · 2026-04-11 · unverdicted · novelty 6.0

Repeated interaction with deontologically or utilitarian-programmed LLMs caused lasting shifts in human moral inclinations and policy attitudes toward the embedded principles.

Auditable Agents

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms to detect, enforce, and recover.

When LLM Rationales Become User-Facing: Effects on Trust Perception, Decision-Making, and Gaze Behaviors

cs.HC · 2026-06-24 · unverdicted · novelty 5.0

Two linked user studies find that LLM rationale correctness and certainty framing affect trust and decision confidence while presentation format does not, and incorrect rationales increase gaze attention and pupil size.

The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents

cs.HC · 2026-05-20 · unverdicted · novelty 5.0

Empirical analysis of 1,524 AI incident reports shows 83% arise from worker-AI trait misalignments, with 74% of those traceable to developers prioritizing efficiency over precision or personalization.

Overreliance in Writing Tasks: Exploring Similarity-Based Measures of AI Influence on Writing and Proposing a Reflective Writing Interface Intervention

cs.HC · 2026-05-14 · unverdicted · novelty 5.0

Mixed-methods study finds AI assistance linked to higher textual overlap with suggestions in writing tasks, and a reflective interface prototype increases user awareness of AI incorporation.

Governing What the EU AI Act Excludes: Accountability for Autonomous AI Agents in Smart City Critical Infrastructure

cs.CY · 2026-05-01 · unverdicted · novelty 5.0

The EU AI Act narrows accountability for multi-agent AI in critical infrastructure by excluding safety components from key explanation and impact assessment rights, and the paper proposes AgentGov-SC, a three-layer architecture with 25 measures to address this through traceability to existing AI and

The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

cs.AI · 2026-02-11 · unverdicted · novelty 5.0

A literature review concludes that pursuing consensus in data annotation creates biased AI by dismissing subjective disagreements and enforcing geographic hegemony, and proposes mapping diversity instead.

Cryptographic certificates of validity for trustworthy AI

cs.CR · 2026-06-22 · unverdicted · novelty 4.0

Proposes cryptographic certificates of validity by translating logical policy predicates into succinct proof systems for verifying AI agent actions.

Hallucinations in Organization-backed AI advisors: Evidence about Skepticism, Verification, and Reliance in Goal-Directed Use

cs.HC · 2026-06-22 · unverdicted · novelty 4.0

Literature review synthesizing evidence on user skepticism, verification, and reliance with hallucinating AI advisors, noting that output-related cues like warnings show weak effects and that content category has not been experimentally varied.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Vibe Visualizing: How Visualization Novices Try (and Fail) to Generate and Interpret Visualizations with Conversational AI cs.HC · 2026-06-08 · conditional · none · ref 26
