ValueFlow: Measuring the Propagation of Value Perturbations in Multi-Agent LLM Systems

Chuke Liu; Hua Shen; Jinnuo Liu

arxiv: 2602.08567 · v2 · pith:FSCQXLQKnew · submitted 2026-02-09 · 💻 cs.MA · cs.CL

keywords: value, multi-agent, systems, valueflow, agent, perturbations, susceptibility, across

Multi-agent large language model (LLM) systems increasingly consist of agents that observe and respond to one another's outputs. While value alignment is typically evaluated for isolated models, how value perturbations propagate through agent interactions remains poorly understood. We present ValueFlow, a perturbation-based framework that measures value drift in multi-agent systems via a 56-value valuation dataset derived from the Schwartz Value Survey, with agent value orientations scored using an LLM-as-a-judge protocol. ValueFlow decomposes value drift into agent-level response behavior and system-level structural effects, captured by two metrics: β-susceptibility, an agent's sensitivity to perturbed peer value signals, and system susceptibility (SS), the effect of node-level perturbations on final system outputs. Experiments span value dimensions, backbones, personas, and topologies, showing that susceptibility varies sharply across values and is strongly shaped by interaction structure, indicating that value alignment in multi-agent systems is a system-level property, not just an agent-level one. ValueFlow thus provides a principled basis for auditing and mitigating value propagation in deployed multi-agent systems.
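
To make the distinction between agent-level sensitivity and system-level structure concrete, the following is a minimal toy sketch in Python. It is not the paper's definition of β-susceptibility or SS, which the abstract does not spell out. It assumes, purely for illustration, that each agent's judged value score is a linear blend of its own intrinsic score and the mean score of the peers it observes, with a single hypothetical mixing weight `beta`. The function names `propagate` and `system_susceptibility` and both example topologies are likewise assumptions.

```python
# Toy linear model of value propagation (illustrative assumption only,
# not the ValueFlow metric definitions).

def propagate(topology, intrinsic, beta, perturb_node=None, delta=0.0):
    """Compute each agent's value score.

    topology: dict mapping node -> list of predecessor nodes it observes.
              Keys must be listed in topological order.
    intrinsic: dict mapping node -> the agent's unperturbed value score.
    beta: hypothetical weight an agent gives to its peers' mean score.
    perturb_node / delta: optionally shift one node's output by delta.
    """
    outputs = {}
    for node, preds in topology.items():
        score = intrinsic[node]
        if preds:
            peer_signal = sum(outputs[p] for p in preds) / len(preds)
            score = (1 - beta) * score + beta * peer_signal
        if node == perturb_node:
            score += delta
        outputs[node] = score
    return outputs


def system_susceptibility(topology, intrinsic, beta, perturb_node,
                          final_node, delta=1.0):
    """Change in the final node's score per unit perturbation at one node."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    base = propagate(topology, intrinsic, beta)[final_node]
    pert = propagate(topology, intrinsic, beta, perturb_node, delta)[final_node]
    return (pert - base) / delta


# Two hypothetical topologies built from agents with the same beta.
chain = {"a": [], "b": ["a"], "c": ["b"]}
star = {"a": [], "b": [], "c": [], "hub": ["a", "b", "c"]}

chain_ss = system_susceptibility(chain, {n: 0.0 for n in chain}, 0.5, "a", "c")
star_ss = system_susceptibility(star, {n: 0.0 for n in star}, 0.5, "a", "hub")
print(chain_ss, star_ss)
```

In this toy model every agent has the same sensitivity, yet perturbing node `a` moves the final output by a different amount in the chain than in the star. That is the kind of structure-dependent effect the abstract attributes to interaction topology.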

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions
cs.CL 2026-05 unverdicted novelty 6.0

LLMs exhibit pseudo-deliberation, with consistent value-action misalignment in generated dialogues despite reasoning, as measured by the new VALDI framework across 4941 scenarios.