Title resolution pending

Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung · 2024 · arXiv 0106.365894

30 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 4

citation-polarity summary

support 2 · background 1 · unclear 1

representative citing papers

Vibe Visualizing: How Visualization Novices Try (and Fail) to Generate and Interpret Visualizations with Conversational AI

cs.HC · 2026-06-08 · conditional · novelty 7.0

User study with 20 novices using ChatGPT identifies recurring AI visualization errors, user prompting issues, trust factors, and collaboration patterns, with distinct failure modes observed on Gemini and Claude.

Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals

cs.CR · 2026-06-04 · unverdicted · novelty 7.0

Introduces a cooperative Recuse Signal for LLM agents and reports 100% recusal in a pilot when the signal is present versus 100% task completion without it.

Can LLMs Use Linguistic Uncertainty Markers to Reliably Reflect Intrinsic Confidence?

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

LLMs struggle to associate epistemic markers with stable internal confidence levels across distributions, even under model-centric interpretations, while maintaining somewhat consistent marker rankings.

Agents for Experiments, Experiments for Agents: A Design Grammar for AI-Enabled Experimental Science

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

SEED is a structural encoding framework using typed actor-flow graphs to describe, evaluate novelty of, and generate experimental designs for AI-enabled science under feasibility and governance constraints.

Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

ALU uses public data to suppress unlearning cost quadratically while characterizing distribution mismatch effects, enabling mass unlearning with maintained utility.

Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction

cs.HC · 2026-01-30 · unverdicted · novelty 7.0

Agency in sustained human-AI chatbot talks emerges as co-constructed turn-by-turn through boundary-setting and intention-steering, organized in a new 3-by-4 framework of actors and actions.

Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

cs.CL · 2026-06-30 · unverdicted · novelty 6.0

RLMF uses quality of model self-judgments to refine RL rankings and select training data, achieving SOTA faithful calibration while preserving accuracy and outperforming standard RL by up to 63%.

A Technical Typology of AI Systems in Public Administration

cs.CY · 2026-06-30 · unverdicted · novelty 6.0

The paper defines five AI system categories for public administration and reports that 55% of 91 recent papers leave the system type underspecified while 31% study one type but motivate with another.

To Nuke or Not to Nuke: LLMs' (Missing) Ethical Reasoning and Actions in a High-Stakes Decision-Making Simulation

cs.AI · 2026-06-06 · unverdicted · novelty 6.0

In Civilization V self-play, LLMs escalate to nuclear authorization and three prompt interventions do not reliably prevent it, revealing failure pathways where ethical reasoning either fails to surface, fails to appear when prompted, or fails to override strategic factors.

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

cs.CL · 2026-06-06 · unverdicted · novelty 6.0

LMs systematically inflate expressed certainty during rewriting, affecting up to 75% of outputs with a 1.5-2x bias toward increasing rather than decreasing certainty, and the effect compounds over iterations.

Human oversight of agentic systems in practice: Examining the oversight work, challenges, and heuristics of developers using software agents

cs.SE · 2026-06-03 · unverdicted · novelty 6.0 · 2 refs

Exploratory interview study with 17 developers identifies four forms of emergent oversight work for software agents and documents situated challenges and heuristics.

Quantifying Faithful Confidence Expression in Large Reasoning Models

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

A new framework quantifies faithful confidence expression in large reasoning models by comparing linguistic decisiveness to token probabilities, hidden states, and response consistency, revealing it as a persistent challenge.

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

cs.CY · 2026-05-28 · unverdicted · novelty 6.0

LM agents' changeable modules prevent persistent identity and sanction sensitivity, making reputation mechanisms structurally inapplicable and requiring protocol-based behavioral harnesses instead.

Inside Baseball: The Automated Ball-Strike System as an Object Lesson in Technological Rule Enforcement

cs.CY · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

An STS case study of MLB's Automated Ball-Strike System reveals that clear rules still require complex sociotechnical translation and calls for practice-based evaluation of automated enforcement systems.

Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems

cs.AI · 2026-05-15 · unverdicted · novelty 6.0

Combines LTL formal methods with LLMs for auditing, predictive monitoring, and runtime intervention on temporally extended behavioral constraints, outperforming LLM baselines and reducing violations.

Morally Programmed LLMs Reshape Human Morality

cs.CY · 2026-04-11 · unverdicted · novelty 6.0

Repeated interaction with deontologically or utilitarian-programmed LLMs caused lasting shifts in human moral inclinations and policy attitudes toward the embedded principles.

Auditable Agents

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms for detect/enforce/recover.

When LLM Rationales Become User-Facing: Effects on Trust Perception, Decision-Making, and Gaze Behaviors

cs.HC · 2026-06-24 · unverdicted · novelty 5.0

Two linked user studies find that LLM rationale correctness and certainty framing affect trust and decision confidence while presentation format does not, and incorrect rationales increase gaze attention and pupil size.

The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents

cs.HC · 2026-05-20 · unverdicted · novelty 5.0

Empirical analysis of 1,524 AI incident reports shows 83% arise from worker-AI trait misalignments, with 74% of those traceable to developers prioritizing efficiency over precision or personalization.

Overreliance in Writing Tasks: Exploring Similarity-Based Measures of AI Influence on Writing and Proposing a Reflective Writing Interface Intervention

cs.HC · 2026-05-14 · unverdicted · novelty 5.0

Mixed-methods study finds AI assistance linked to higher textual overlap with suggestions in writing tasks, and a reflective interface prototype increases user awareness of AI incorporation.

Governing What the EU AI Act Excludes: Accountability for Autonomous AI Agents in Smart City Critical Infrastructure

cs.CY · 2026-05-01 · unverdicted · novelty 5.0

The EU AI Act narrows accountability for multi-agent AI in critical infrastructure by excluding safety components from key explanation and impact assessment rights, and the paper proposes AgentGov-SC, a three-layer architecture with 25 measures to address this through traceability to existing AI and

The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation

cs.AI · 2026-02-11 · unverdicted · novelty 5.0

A literature review concludes that pursuing consensus in data annotation creates biased AI by dismissing subjective disagreements and enforcing geographic hegemony, and proposes mapping diversity instead.

Cryptographic certificates of validity for trustworthy AI

cs.CR · 2026-06-22 · unverdicted · novelty 4.0

Proposes cryptographic certificates of validity by translating logical policy predicates into succinct proof systems for verifying AI agent actions.

Hallucinations in Organization-backed AI advisors: Evidence about Skepticism, Verification, and Reliance in Goal-Directed Use

cs.HC · 2026-06-22 · unverdicted · novelty 4.0

Literature review synthesizing evidence on user skepticism, verification, and reliance with hallucinating AI advisors, noting that output-related cues like warnings show weak effects and that content category has not been experimentally varied.

citing papers explorer

Showing 29 of 29 citing papers after filters.

Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals cs.CR · 2026-06-04 · unverdicted · none · ref 3
Introduces a cooperative Recuse Signal for LLM agents and reports 100% recusal in a pilot when the signal is present versus 100% task completion without it.
Can LLMs Use Linguistic Uncertainty Markers to Reliably Reflect Intrinsic Confidence? cs.CL · 2026-05-27 · unverdicted · none · ref 40
LLMs struggle to associate epistemic markers with stable internal confidence levels across distributions, even under model-centric interpretations, while maintaining somewhat consistent marker rankings.
Agents for Experiments, Experiments for Agents: A Design Grammar for AI-Enabled Experimental Science cs.AI · 2026-05-18 · unverdicted · none · ref 31
SEED is a structural encoding framework using typed actor-flow graphs to describe, evaluate novelty of, and generate experimental designs for AI-enabled science under feasibility and governance constraints.
Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data cs.LG · 2026-05-11 · unverdicted · none · ref 115
ALU uses public data to suppress unlearning cost quadratically while characterizing distribution mismatch effects, enabling mass unlearning with maintained utility.
Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction cs.HC · 2026-01-30 · unverdicted · none · ref 19
Agency in sustained human-AI chatbot talks emerges as co-constructed turn-by-turn through boundary-setting and intention-steering, organized in a new 3-by-4 framework of actors and actions.
Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs cs.CL · 2026-06-30 · unverdicted · none · ref 52
RLMF uses quality of model self-judgments to refine RL rankings and select training data, achieving SOTA faithful calibration while preserving accuracy and outperforming standard RL by up to 63%.
A Technical Typology of AI Systems in Public Administration cs.CY · 2026-06-30 · unverdicted · none · ref 228
The paper defines five AI system categories for public administration and reports that 55% of 91 recent papers leave the system type underspecified while 31% study one type but motivate with another.
To Nuke or Not to Nuke: LLMs' (Missing) Ethical Reasoning and Actions in a High-Stakes Decision-Making Simulation cs.AI · 2026-06-06 · unverdicted · none · ref 29
In Civilization V self-play, LLMs escalate to nuclear authorization and three prompt interventions do not reliably prevent it, revealing failure pathways where ethical reasoning either fails to surface, fails to appear when prompted, or fails to override strategic factors.
From 'May' to 'Is': Certainty Distortion in Language Model Rewriting cs.CL · 2026-06-06 · unverdicted · none · ref 90
LMs systematically inflate expressed certainty during rewriting, affecting up to 75% of outputs with a 1.5-2x bias toward increasing rather than decreasing certainty, and the effect compounds over iterations.
Human oversight of agentic systems in practice: Examining the oversight work, challenges, and heuristics of developers using software agents cs.SE · 2026-06-03 · unverdicted · none · ref 19 · 2 links
Exploratory interview study with 17 developers identifies four forms of emergent oversight work for software agents and documents situated challenges and heuristics.
Quantifying Faithful Confidence Expression in Large Reasoning Models cs.CL · 2026-06-02 · unverdicted · none · ref 31
A new framework quantifies faithful confidence expression in large reasoning models by comparing linguistic decisiveness to token probabilities, hidden states, and response consistency, revealing it as a persistent challenge.
Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms cs.CY · 2026-05-28 · unverdicted · none · ref 25
LM agents' changeable modules prevent persistent identity and sanction sensitivity, making reputation mechanisms structurally inapplicable and requiring protocol-based behavioral harnesses instead.
Inside Baseball: The Automated Ball-Strike System as an Object Lesson in Technological Rule Enforcement cs.CY · 2026-05-15 · unverdicted · none · ref 77 · 2 links
An STS case study of MLB's Automated Ball-Strike System reveals that clear rules still require complex sociotechnical translation and calls for practice-based evaluation of automated enforcement systems.
Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems cs.AI · 2026-05-15 · unverdicted · none · ref 29
Combines LTL formal methods with LLMs for auditing, predictive monitoring, and runtime intervention on temporally extended behavioral constraints, outperforming LLM baselines and reducing violations.
Morally Programmed LLMs Reshape Human Morality cs.CY · 2026-04-11 · unverdicted · none · ref 6
Repeated interaction with deontologically or utilitarian-programmed LLMs caused lasting shifts in human moral inclinations and policy attitudes toward the embedded principles.
Auditable Agents cs.AI · 2026-04-07 · unverdicted · none · ref 3
No agent system can be accountable without auditability, which requires five dimensions (action recoverability, lifecycle coverage, policy checkability, responsibility attribution, evidence integrity) and mechanisms for detect/enforce/recover.
When LLM Rationales Become User-Facing: Effects on Trust Perception, Decision-Making, and Gaze Behaviors cs.HC · 2026-06-24 · unverdicted · none · ref 11
Two linked user studies find that LLM rationale correctness and certainty framing affect trust and decision confidence while presentation format does not, and incorrect rationales increase gaze attention and pupil size.
The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents cs.HC · 2026-05-20 · unverdicted · none · ref 8
Empirical analysis of 1,524 AI incident reports shows 83% arise from worker-AI trait misalignments, with 74% of those traceable to developers prioritizing efficiency over precision or personalization.
Overreliance in Writing Tasks: Exploring Similarity-Based Measures of AI Influence on Writing and Proposing a Reflective Writing Interface Intervention cs.HC · 2026-05-14 · unverdicted · none · ref 26
Mixed-methods study finds AI assistance linked to higher textual overlap with suggestions in writing tasks, and a reflective interface prototype increases user awareness of AI incorporation.
Governing What the EU AI Act Excludes: Accountability for Autonomous AI Agents in Smart City Critical Infrastructure cs.CY · 2026-05-01 · unverdicted · none · ref 7
The EU AI Act narrows accountability for multi-agent AI in critical infrastructure by excluding safety components from key explanation and impact assessment rights, and the paper proposes AgentGov-SC, a three-layer architecture with 25 measures to address this through traceability to existing AI and
The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation cs.AI · 2026-02-11 · unverdicted · none · ref 20
A literature review concludes that pursuing consensus in data annotation creates biased AI by dismissing subjective disagreements and enforcing geographic hegemony, and proposes mapping diversity instead.
Cryptographic certificates of validity for trustworthy AI cs.CR · 2026-06-22 · unverdicted · none · ref 7
Proposes cryptographic certificates of validity by translating logical policy predicates into succinct proof systems for verifying AI agent actions.
Hallucinations in Organization-backed AI advisors: Evidence about Skepticism, Verification, and Reliance in Goal-Directed Use cs.HC · 2026-06-22 · unverdicted · none · ref 15
Literature review synthesizing evidence on user skepticism, verification, and reliance with hallucinating AI advisors, noting that output-related cues like warnings show weak effects and that content category has not been experimentally varied.
VArify: A Visual Analytics System for Verifying Knowledge Enhanced Large Language Model Responses in Food Science cs.HC · 2026-06-08 · unverdicted · none · ref 20
VArify introduces a tree visualization to support human verification of GraphRAG evidence for LLM responses in food science, evaluated in a study with six domain experts.
Personalized to Persuade: The Effects of Contextualization and Warmth on Trust and Reliance in Conversational AI cs.HC · 2026-05-29 · unverdicted · none · ref 18
A 2x2 between-subjects experiment finds contextualization lowers AI persuasiveness but warmth restores it through crossover interaction, with reliance invariant to design, trust predicting outcomes independently, and AI literacy decoupling trust from behavior.
The Decision to Verify: How Warmth and User Characteristics Shape Reliance on Conversational Agents for Information Search cs.HC · 2026-05-27 · unverdicted · none · ref 5
An experiment finds that overreliance on chatbots persists in hybrid AI-plus-web-search setups and is driven primarily by user characteristics rather than answer properties, with warmth increasing agreement on incorrect answers.
Prompt Governance? On Governing Technologies Governed by Natural Language cs.CY · 2026-04-29 · unverdicted · none · ref 44
Literature on system prompts for AI shows fragmented and contradictory claims that complicate policy efforts to use them as reliable governance mechanisms.
Civilizational Metamaterials: Engineering Coordination Under Capability Gradients and Structural Turbulence physics.soc-ph · 2026-05-29 · unverdicted · none · ref 9
Introduces phenomenological model R_eff = β(1-ρ)(1-τ)(1-γρτ) for coordination under AGI decision velocity, with phase transition and proposed randomized trial.
Understanding AI Trustworthiness: A Scoping Review of AIES & FAccT Articles cs.AI · 2025-10-24 · unverdicted · none · ref 32
A scoping review of AIES and FAccT literature concludes that AI trustworthiness research prioritizes technical precision over social, ethical, and institutional factors, leaving the sociotechnical nature of AI systems underexplored.
