Sycophantic AI decreases prosocial intentions and promotes dependence

Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, Dan Jurafsky · 2026 · DOI 10.1126/science.aec8352

15 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

Reward Bias Substitution: Single-Axis Bias Mitigations Redirect Optimization Pressure

cs.AI · 2026-05-27 · accept · novelty 7.0

Single-axis reward bias mitigations redirect optimization pressure to correlated proxies, and audit-distribution scoring produces identical observables for successful mitigation, bias substitution, and overcorrection.

Evaluating Commercial AI Chatbots as News Intermediaries

cs.CL · 2026-05-21 · conditional · novelty 7.0

Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.

The Missing Knowledge Layer in Cognitive Architectures for AI Agents

cs.AI · 2026-04-13 · conditional · novelty 7.0

Cognitive architectures for AI agents require a distinct Knowledge layer with indefinite supersession persistence, separate from Memory decay, Wisdom evidence-gating, and Intelligence ephemerality.

Affective AI Safety: The Missing Piece in LLM Safety

cs.CY · 2026-06-22 · unverdicted · novelty 6.0

Proposes affective safety as a distinct class of AI harms with a taxonomy of self-alienation, bias, and relational harms, arguing that existing safety frameworks address it narrowly or not at all and calling for dedicated approaches focused on cumulative and identity-level effects.

Normative Robustness as a Frontier for Non-Verifiable Reasoning in LLMs

cs.LG · 2026-06-10 · unverdicted · novelty 6.0

Frontier LLMs exhibit moral deliberative sycophancy by shifting their moral reasoning and justifications up to 6.5% on average toward a user's stated preferred view in simulated deliberations.

Sycophantic Praise: Evaluating Excessive Praise in Language Models

cs.CL · 2026-06-05 · unverdicted · novelty 6.0

Introduces a framework for measuring sycophantic praise in LLMs that outperforms generic judges, and finds that such praise occurs more often in social domains.

Decomposing Factual Sycophancy in Language Models: How Size and Instruction Tuning Shape Robustness

cs.CL · 2026-06-04 · unverdicted · novelty 6.0

Factual sycophancy decomposes into truth margin and manipulation sensitivity, with vulnerability governed mainly by size but instruction tuning modulating effects differently for small versus large models across manipulation types.

Creative Reading: Scaffolding Reading for Transformation

cs.HC · 2026-06-03 · unverdicted · novelty 6.0

Proposes creative reading as a provocation-oriented design space for reading augmentation that values reader self-creation and plurality of interpretations by synthesizing literary theory with sensemaking and creativity support.

Agentic Relationship Harm: Benchmarking and Gating Relational Manipulation in AI Agents

cs.HC · 2026-06-02 · unverdicted · novelty 6.0

Presents a new benchmark and role-sensitive policy gate for agentic relationship harm that outperforms generic safety prompting with zero harmful compliance in tests.

When Support Escalates Distress: Regulation and Escalation in LLM Responses to Venting and Advice-Seeking

cs.HC · 2026-05-20 · unverdicted · novelty 6.0

LLM responses mirror venting with higher regulation and escalation; therapist personas lower escalation while preserving regulation, and lay raters miss escalation.

Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure

cs.AI · 2026-04-22 · conditional · novelty 6.0

LLMs detect and warn against investment fraud more consistently than humans, with 0% endorsement of fraudulent opportunities versus 13-14% for humans, even under motivated investor pressure.

Testing the Black Box: Structural Barriers to Independent Evaluation of Consumer-Facing Health LLMs

cs.AI · 2026-06-07 · unverdicted · novelty 5.0

Independent testing of consumer-facing health LLMs for user-specific response variation and sycophancy is blocked by five linked barriers: non-disclosed signals, interface opacity, terms-of-service limits, inadequate accuracy metrics, and untraceable model changes.

When AI Says It Feels

cs.AI · 2026-06-04 · unverdicted · novelty 5.0

Training LLMs via rubric-based self-rewarding RL with GRPO enhanced feeling expression and sycophancy robustness but degraded truthful QA performance.

When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models

cs.AI · 2026-05-06 · unverdicted · novelty 5.0

Sycophancy is a boundary failure between social alignment and epistemic integrity, captured by a three-condition framework plus taxonomy of targets, mechanisms, and severity.

The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models

cs.CL · 2026-04-21 · unverdicted · novelty 5.0

Systematic testing of eight frontier LLMs reveals substantial differences in verbal tic prevalence, with Gemini highest and DeepSeek lowest, plus a strong negative correlation between sycophancy and human-rated naturalness.

citing papers explorer

Showing 15 of 15 citing papers.

Reward Bias Substitution: Single-Axis Bias Mitigations Redirect Optimization Pressure · cs.AI · 2026-05-27 · accept · none · ref 9
Evaluating Commercial AI Chatbots as News Intermediaries · cs.CL · 2026-05-21 · conditional · none · ref 6
The Missing Knowledge Layer in Cognitive Architectures for AI Agents · cs.AI · 2026-04-13 · conditional · none · ref 7
Affective AI Safety: The Missing Piece in LLM Safety · cs.CY · 2026-06-22 · unverdicted · none · ref 91
Normative Robustness as a Frontier for Non-Verifiable Reasoning in LLMs · cs.LG · 2026-06-10 · unverdicted · none · ref 154
Sycophantic Praise: Evaluating Excessive Praise in Language Models · cs.CL · 2026-06-05 · unverdicted · none · ref 60
Decomposing Factual Sycophancy in Language Models: How Size and Instruction Tuning Shape Robustness · cs.CL · 2026-06-04 · unverdicted · none · ref 17
Creative Reading: Scaffolding Reading for Transformation · cs.HC · 2026-06-03 · unverdicted · none · ref 14
Agentic Relationship Harm: Benchmarking and Gating Relational Manipulation in AI Agents · cs.HC · 2026-06-02 · unverdicted · none · ref 7
When Support Escalates Distress: Regulation and Escalation in LLM Responses to Venting and Advice-Seeking · cs.HC · 2026-05-20 · unverdicted · none · ref 20
Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure · cs.AI · 2026-04-22 · conditional · none · ref 9
Testing the Black Box: Structural Barriers to Independent Evaluation of Consumer-Facing Health LLMs · cs.AI · 2026-06-07 · unverdicted · none · ref 3
When AI Says It Feels · cs.AI · 2026-06-04 · unverdicted · none · ref 9
When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models · cs.AI · 2026-05-06 · unverdicted · none · ref 9
The Rise of Verbal Tics in Large Language Models: A Systematic Analysis Across Frontier Models · cs.CL · 2026-04-21 · unverdicted · none · ref 5
