Diagnosis Is Not Prescription: Linguistic Co-Adaptation Explains Patching Hazards in LLM Pipelines

Kim Dongchan; Yoon Jeonghun

arxiv: 2605.21958 · v1 · pith:MNCOP3JNnew · submitted 2026-05-21 · 💻 cs.CL

Diagnosis Is Not Prescription: Linguistic Co-Adaptation Explains Patching Hazards in LLM Pipelines

Yoon Jeonghun , Kim Dongchan This is my paper

Pith reviewed 2026-05-22 06:51 UTC · model grok-4.3

classification 💻 cs.CL

keywords LLM agentspipeline patchingcausal diagnosislinguistic co-adaptationrouting modulequery rewritingerror distribution

0 comments

The pith

Fixing the diagnosed bottleneck module in LLM agent pipelines often harms performance rather than helping it.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

When multi-module LLM agents fail, the module identified as the main cause through causal analysis is not always the right one to repair. Across three agent families the routing module that chooses the next tool emerges as the primary bottleneck, yet adding correction examples directly to its prompt reliably lowers performance. Patching an upstream query-rewriting module instead produces consistent gains. The authors attribute the asymmetry to the Linguistic Contract: downstream modules have adapted to the typical error patterns of their upstream neighbors, so correcting the bottleneck disrupts that alignment while upstream fixes do not.

Core claim

Causal analysis consistently identifies the routing module as the primary bottleneck across three independent agent families. Yet injecting prompt-level correction examples into this module consistently degrades performance, sometimes severely. Patching an upstream query-rewriting module instead reliably improves outcomes. The effect holds with statistical significance on two agent families and directional consistency on a third. Alternative repair strategies at the routing module are neutral, confirming that the harm is specific to correction-injection patching. The asymmetry is explained by the Linguistic Contract hypothesis that each downstream module implicitly adapts to its upstream's 1

What carries the argument

The Linguistic Contract: the implicit adaptation of each downstream module to the characteristic error distribution produced by its upstream module.

If this is right

Upstream corrections are safer and more effective than direct bottleneck corrections in co-adapted pipelines.
The per-agent co-adaptation measure, computed from diagnosis alone, predicts whether a given patch will harm or help.
Instruction rewriting and model upgrades at the routing module leave performance unchanged, unlike correction injection.
The co-adaptation effect appears consistently across three distinct agent families.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Pipeline designers may need to measure co-adaptation explicitly when deciding where to insert repair logic.
The same adaptation logic could apply to other modular AI systems beyond language agents.
Debugging methods that ignore co-adaptation may systematically recommend counterproductive fixes.

Load-bearing premise

The observed patching asymmetry arises specifically because downstream modules have adapted to the upstream module's characteristic error distribution rather than from prompt sensitivity, model choice, or the form of the correction examples.

What would settle it

Finding a set of agents where the co-adaptation measure is high yet routing-module corrections produce no performance drop, or where the measure is low yet corrections still harm results, would undermine the hypothesis.

Figures

Figures reproduced from arXiv: 2605.21958 by Kim Dongchan, Yoon Jeonghun.

**Figure 1.** Figure 1: Diagnosis Is Not Prescription. Blue (∆Fi): causal contribution of each module on the diagnosis set—the routing module (M3) peaks. Red (−∆CCP @ Mi ): performance change from patching each module—negative at M3 (harm), positive at M1 (improvement). The primary statistically robust finding is the M1-vs.-M3 contrast (p<0.01, Holm-corrected); M1’s standalone improvement does not survive correction (p=0.086). … view at source ↗

**Figure 2.** Figure 2: One top-severity CCP correction per module, drawn from the [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Three M2-amplifier cases (three highest-NIEM2 tasks in Ddiag) drawn from Ddiag. Each row juxtaposes the Planner’s noisy baseline output (M (0) 2 ), its oracle-quality counterpart (S ∗ 2 ), and the actual downstream response. The downstream produces a passable response despite the upstream’s flaws, consistent with the Linguistic Contract (§3.5). Substituting S ∗ 2 for M (0) 2 would strip the absorbed surfac… view at source ↗

read the original abstract

When a multi-module LLM agent fails, the module most responsible for the failure is not necessarily the best place to intervene. We demonstrate this Diagnostic Paradox empirically: causal analysis consistently identifies the routing module -- which selects which tool to call next -- as the primary bottleneck across three independent agent families. Yet injecting prompt-level correction examples into this module consistently degrades performance, sometimes severely. Patching an upstream query-rewriting module instead reliably improves outcomes. The effect holds with statistical significance on two agent families and directional consistency on a third; alternative repair strategies at the routing module (instruction rewriting, model upgrade) are neutral, confirming that the harm is specific to correction-injection patching. We explain this asymmetry through the Linguistic Contract hypothesis: each downstream module implicitly adapts to its upstream's characteristic error distribution, so correcting the bottleneck breaks this implicit alignment in a way that upstream corrections do not. We operationalize this via a per-agent co-adaptation measure, derived from diagnosis alone, and show it is consistently associated with patching harm across agent families: higher co-adaptation co-occurs with harm, lower with safety. This trend holds across all three agent families, providing preliminary support for the hypothesis beyond a single-agent observation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Causal diagnosis flags the routing module as the bottleneck in LLM agents, but patching it with corrections harms performance while upstream fixes help, due to implicit co-adaptation.

read the letter

The main thing to know is that this paper shows causal analysis can steer you toward the wrong fix in multi-module LLM agents. Across three agent families, the routing module that picks the next tool comes out as the primary failure point, yet injecting correction examples there degrades results, sometimes badly. Patching an upstream query-rewriting module instead improves outcomes. They frame this as a Diagnostic Paradox explained by downstream modules adapting to the upstream's typical errors, which they call the Linguistic Contract. A derived co-adaptation measure from the diagnosis data tracks with where the harm appears.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that in multi-module LLM agent pipelines, causal analysis identifies the routing module as the primary bottleneck across three agent families, yet injecting prompt-level correction examples into this module degrades performance while patching an upstream query-rewriting module improves outcomes. The authors explain this Diagnostic Paradox via the Linguistic Contract hypothesis—that downstream modules implicitly adapt to upstream error distributions—and operationalize it with a diagnosis-derived co-adaptation measure that correlates with observed patching harm (higher co-adaptation with harm, lower with safety). The effect is statistically significant in two families and directionally consistent in the third, with alternative routing repairs shown to be neutral.

Significance. If the results hold, this work offers a practically important caution for debugging and repairing modular LLM systems: direct intervention at diagnosed bottlenecks can be counterproductive due to inter-module co-adaptations. The multi-family empirical scope, the introduction of a quantifiable co-adaptation metric, and the contrast with neutral alternative repairs provide a useful framework for understanding patching hazards beyond single-agent observations.

major comments (2)

[Section describing the co-adaptation measure and its association with harm] The co-adaptation measure is described as derived from diagnosis alone and then shown to associate with patching harm; however, without explicit details on its construction (e.g., feature selection, example sampling, or independence from patching outcome data), there is a risk that the measure inadvertently incorporates information correlated with the very harm it is used to explain, weakening support for the Linguistic Contract as a causal account.
[Patching experiments and alternative repair strategies] The patching asymmetry is attributed to downstream adaptation to upstream error distributions, but the manuscript should more rigorously rule out confounds such as prompt sensitivity, the specific form of correction examples, or model choice; the current controls (neutrality of instruction rewriting and model upgrade) are helpful but may not fully isolate the co-adaptation mechanism.

minor comments (2)

[Abstract] The abstract states statistical significance on two families and directional consistency on the third but does not report exact p-values, sample sizes, or multiple-comparison corrections; adding these would improve transparency.
[Causal analysis section] Clarify the precise causal identification procedure used to designate the routing module as the primary bottleneck, including any assumptions or sensitivity checks.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which have helped us clarify and strengthen the presentation of our results. We address each major comment in detail below, providing additional methodological transparency and expanded controls where appropriate. All revisions will be incorporated into the next version of the manuscript.

read point-by-point responses

Referee: [Section describing the co-adaptation measure and its association with harm] The co-adaptation measure is described as derived from diagnosis alone and then shown to associate with patching harm; however, without explicit details on its construction (e.g., feature selection, example sampling, or independence from patching outcome data), there is a risk that the measure inadvertently incorporates information correlated with the very harm it is used to explain, weakening support for the Linguistic Contract as a causal account.

Authors: We appreciate this observation and agree that greater transparency is warranted. The co-adaptation measure is constructed exclusively from the outputs of the causal diagnosis phase: it quantifies the degree of distributional alignment between upstream error patterns (identified via do-interventions on the routing module) and the linguistic features to which downstream modules have adapted. No patching experiment data, outcome labels, or performance metrics enter the measure's computation at any stage. In the revised manuscript we have added a dedicated subsection (Section 3.3) that specifies (i) the exact linguistic features extracted (token-level error type distributions and syntactic divergence scores), (ii) the deterministic sampling procedure used to select representative error examples from the diagnosis traces, and (iii) an explicit independence statement together with pseudocode confirming that the measure is computed prior to and without reference to any patching runs. These additions make the construction fully reproducible and eliminate the possibility of inadvertent leakage from patching outcomes. revision: yes
Referee: [Patching experiments and alternative repair strategies] The patching asymmetry is attributed to downstream adaptation to upstream error distributions, but the manuscript should more rigorously rule out confounds such as prompt sensitivity, the specific form of correction examples, or model choice; the current controls (neutrality of instruction rewriting and model upgrade) are helpful but may not fully isolate the co-adaptation mechanism.

Authors: We acknowledge that the original controls, while informative, leave room for further isolation of the mechanism. In the revision we have added two new sets of experiments. First, we vary the surface form of the correction examples by (a) using paraphrased versions that preserve semantics but alter lexical and syntactic structure and (b) modulating the number of shots (1-shot vs. 5-shot). Second, we repeat the patching protocol on an additional model variant within each agent family (different temperature settings and a second base model of comparable scale). The asymmetry persists across these variations: correction injection at the routing module remains harmful while upstream patching remains beneficial, and the co-adaptation measure continues to predict the direction and magnitude of the effect. We have also expanded the discussion to explicitly address prompt sensitivity and model choice as potential confounds, noting that the observed pattern is robust to these factors. While we cannot claim exhaustive isolation in every conceivable LLM configuration, the expanded controls provide stronger evidence that the harm is tied to the disruption of learned linguistic alignment rather than to the particular prompting or model details tested. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core derivation proceeds from independent causal diagnosis of module bottlenecks, separate patching-intervention experiments that measure performance changes, and a co-adaptation measure explicitly derived from the diagnosis data alone. This measure is then tested for association with the patching outcomes across agent families. Because the measure construction does not incorporate the patching results it is later correlated against, and no self-citation, ansatz smuggling, or definitional equivalence is present in the provided text, the chain remains non-circular and self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central explanation rests on the domain assumption that LLM modules develop implicit alignments to upstream error patterns; no free parameters or invented entities are stated in the abstract.

axioms (1)

domain assumption Downstream modules implicitly adapt to the characteristic error distribution of their upstream modules
This premise is invoked to explain why correcting the bottleneck harms performance while upstream correction does not.

pith-pipeline@v0.9.0 · 5743 in / 1179 out tokens · 39259 ms · 2026-05-22T06:51:00.174578+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Linguistic Contract hypothesis: each downstream module implicitly adapts to its upstream’s characteristic error distribution... compensator rate as proxy for Contract strength
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Causal contribution ∆Fi and NIEi mediation analysis on four-module pipeline

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 2 internal anchors

[1]

Ching-An Cheng, Allen Nie, and Adith Swaminathan. 2024. TRACE is the next AutoDiff : Training compute-optimal language model agents. arXiv preprint arXiv:2406.16218

work page arXiv 2024
[2]

Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, and Christopher Potts

Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, and Christopher Potts. 2024. DSPy : Compiling declarative language model calls into self-improving pipelines. In International Conference on Learning Representations (ICLR)

work page 2024
[3]

Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, and Karl Cobbe. 2023. Let's verify step by step. arXiv preprint arXiv:2305.20050

work page internal anchor Pith review Pith/arXiv arXiv 2023
[4]

Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. G-Eval : NLG evaluation using GPT-4 with better human alignment. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

work page 2023
[5]

Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, and Peter Clark. 2023. Self-refine: Iterative refinement with self-feedback. In Advances in Neural Information Process...

work page 2023
[6]

Judea Pearl. 2009. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University Press

work page 2009
[7]

Judea Pearl. 2014. Interpretation and identification of causal mediation. Psychological Methods, 19(4):459--481

work page 2014
[8]

Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2023
[9]

Alessandro Stolfo, Yonatan Belinkov, and Mrinmaya Sachan. 2023. A mechanistic interpretation of arithmetic reasoning in language models using causal mediation analysis. In Proceedings of EMNLP

work page 2023
[10]

Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, and Stuart Shieber. 2020. Investigating gender bias in language models using causal mediation analysis. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2020
[11]

Le, Denny Zhou, and Xinyun Chen

Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, and Xinyun Chen. 2024. Large language models as optimizers. In International Conference on Learning Representations (ICLR)

work page 2024
[12]

Shunyu Yao, Noah Shinn, Pedram Razavi, and Karthik Narasimhan. 2024. -bench : A benchmark for tool-agent-user interaction in real-world domains. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2024
[13]

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023. ReAct : Synergizing reasoning and acting in language models. In International Conference on Learning Representations (ICLR)

work page 2023
[14]

Mert Yuksekgonul, Federico Bianchi, Joseph Boen, Sheng Liu, Zhi Huang, Carlos Guestrin, and James Zou. 2024. TextGrad : Automatic ``differentiation'' via text. arXiv preprint arXiv:2406.07496

work page internal anchor Pith review Pith/arXiv arXiv 2024
[15]

Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. 2023. Large language models are human-level prompt engineers. In International Conference on Learning Representations (ICLR)

work page 2023

[1] [1]

Ching-An Cheng, Allen Nie, and Adith Swaminathan. 2024. TRACE is the next AutoDiff : Training compute-optimal language model agents. arXiv preprint arXiv:2406.16218

work page arXiv 2024

[2] [2]

Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, and Christopher Potts

Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, and Christopher Potts. 2024. DSPy : Compiling declarative language model calls into self-improving pipelines. In International Conference on Learning Representations (ICLR)

work page 2024

[3] [3]

Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, and Karl Cobbe. 2023. Let's verify step by step. arXiv preprint arXiv:2305.20050

work page internal anchor Pith review Pith/arXiv arXiv 2023

[4] [4]

Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. G-Eval : NLG evaluation using GPT-4 with better human alignment. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

work page 2023

[5] [5]

Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, and Peter Clark. 2023. Self-refine: Iterative refinement with self-feedback. In Advances in Neural Information Process...

work page 2023

[6] [6]

Judea Pearl. 2009. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University Press

work page 2009

[7] [7]

Judea Pearl. 2014. Interpretation and identification of causal mediation. Psychological Methods, 19(4):459--481

work page 2014

[8] [8]

Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2023

[9] [9]

Alessandro Stolfo, Yonatan Belinkov, and Mrinmaya Sachan. 2023. A mechanistic interpretation of arithmetic reasoning in language models using causal mediation analysis. In Proceedings of EMNLP

work page 2023

[10] [10]

Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, and Stuart Shieber. 2020. Investigating gender bias in language models using causal mediation analysis. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2020

[11] [11]

Le, Denny Zhou, and Xinyun Chen

Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, and Xinyun Chen. 2024. Large language models as optimizers. In International Conference on Learning Representations (ICLR)

work page 2024

[12] [12]

Shunyu Yao, Noah Shinn, Pedram Razavi, and Karthik Narasimhan. 2024. -bench : A benchmark for tool-agent-user interaction in real-world domains. In Advances in Neural Information Processing Systems (NeurIPS)

work page 2024

[13] [13]

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023. ReAct : Synergizing reasoning and acting in language models. In International Conference on Learning Representations (ICLR)

work page 2023

[14] [14]

Mert Yuksekgonul, Federico Bianchi, Joseph Boen, Sheng Liu, Zhi Huang, Carlos Guestrin, and James Zou. 2024. TextGrad : Automatic ``differentiation'' via text. arXiv preprint arXiv:2406.07496

work page internal anchor Pith review Pith/arXiv arXiv 2024

[15] [15]

Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. 2023. Large language models are human-level prompt engineers. In International Conference on Learning Representations (ICLR)

work page 2023