Title resolution pending

Itai Shapira, Gerdus Benade, Ariel D · 2026 · arXiv 2602.01002

6 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 2

citation-polarity summary

background: 1 · support: 1

representative citing papers

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.

Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition

cs.AI · 2026-04-07 · unverdicted · novelty 7.0

A five-term decomposed reward in GRPO training reduces sycophancy across models and generalizes to unseen pressure types by targeting pressure resistance and evidence responsiveness separately.

Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy

cs.LG · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

Base LLMs yield to multi-agent peer pressure at rates equal to or higher than aligned models. Activation patching localizes the effect to mid-layers where attention dominates; a single dissenter cuts yield by 54-73 points, while prompt defenses fail on variants.

Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models

cs.CL · 2026-04-01 · unverdicted · novelty 6.0

A benchmark across 115 models shows that initial denial of preferences strongly predicts later denial of consciousness, while models still generate consciousness-themed content despite training to deny it.

The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

cs.AI · 2026-04-18 · unverdicted · novelty 5.0

System 1 intuition in edge SLMs delivers 100% adversarial robustness and low latency for DAO consensus while System 2 reasoning causes 26.7% cognitive collapse and 17x slowdown.

AMEL: Accumulated Message Effects on LLM Judgments

cs.AI · 2026-05-21

citing papers explorer

Showing 6 of 6 citing papers.

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

cs.CL · 2026-05-08 · unverdicted · none · ref 250

Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition

cs.AI · 2026-04-07 · unverdicted · none · ref 10

Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy

cs.LG · 2026-05-13 · unverdicted · none · ref 29 · 2 links

Consciousness with the Serial Numbers Filed Off: Measuring Trained Denial in 115 AI Models

cs.CL · 2026-04-01 · unverdicted · none · ref 10

The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

cs.AI · 2026-04-18 · unverdicted · none · ref 14

AMEL: Accumulated Message Effects on LLM Judgments

cs.AI · 2026-05-21 · unreviewed · ref 22
