Francesco Salvi, Manoel Horta Ribeiro, Riccardo Gallotti, and Robert West

URL https://arxiv · arXiv 2305.14930

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

LLMs organize prompted social roles along a dominant, stable, and causally steerable granularity axis in representation space that runs from micro to macro levels.

Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs

cs.CL · 2026-04-21 · unverdicted · novelty 6.0

Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.

Mitigating LLM biases toward spurious social contexts using direct preference optimization

cs.AI · 2026-04-02 · unverdicted · novelty 6.0

Debiasing-DPO reduces bias to spurious social contexts by 84% and improves predictive accuracy by 52% on average for LLMs evaluating U.S. classroom transcripts.

Beyond Inefficiency: Systemic Costs of Incivility in Multi-Agent Monte Carlo Simulations

cs.AI · 2026-05-12 · unverdicted · novelty 5.0

Monte Carlo simulations of LLM agents confirm that toxic debates take 25% longer to converge, with larger delays in smaller models, and show a first-mover advantage independent of toxicity.

citing papers explorer

Showing 4 of 4 citing papers.

The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models cs.AI · 2026-05-07 · unverdicted · none · ref 29
LLMs organize prompted social roles along a dominant, stable, and causally steerable granularity axis in representation space that runs from micro to macro levels.
Talking to a Know-It-All GPT or a Second-Guesser Claude? How Repair reveals unreliable Multi-Turn Behavior in LLMs cs.CL · 2026-04-21 · unverdicted · none · ref 113
Each tested LLM shows its own characteristic unreliability when engaging in repair during extended math-question dialogues.
Mitigating LLM biases toward spurious social contexts using direct preference optimization cs.AI · 2026-04-02 · unverdicted · none · ref 22
Debiasing-DPO reduces bias to spurious social contexts by 84% and improves predictive accuracy by 52% on average for LLMs evaluating U.S. classroom transcripts.
Beyond Inefficiency: Systemic Costs of Incivility in Multi-Agent Monte Carlo Simulations cs.AI · 2026-05-12 · unverdicted · none · ref 18
Monte Carlo simulations of LLM agents confirm that toxic debates take 25% longer to converge, with larger delays in smaller models, and show a first-mover advantage independent of toxicity.

Francesco Salvi, Manoel Horta Ribeiro, Riccardo Gallotti, and Robert West

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer