Old Habits Die Hard: How Conversational History Geometrically Traps LLMs
Pith reviewed 2026-05-21 12:56 UTC · model grok-4.3
The pith
Conversational history traps LLMs in geometric gaps that confine their internal trajectories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By modeling conversations as Markov chains to quantify state consistency and measuring the similarity of consecutive hidden representations, the work shows that behavioral persistence in LLMs arises as a geometric trap where gaps in the latent space confine the model's trajectory to paths set by prior interactions.
What carries the argument
Gaps in the latent space that restrict the trajectory of hidden representations and thereby sustain consistent response patterns across turns.
If this is right
- Earlier errors such as hallucinations continue to shape later answers even when the immediate prompt has moved on.
- The strength of this locking can be read out directly from how close successive hidden representations remain to each other.
- The same pattern of confinement appears across different model families and across datasets that cover many kinds of conversational phenomena.
- Quantifying the size of the gaps offers a concrete way to predict how strongly any given history will bias future outputs.
Where Pith is reading between the lines
- Methods that deliberately nudge hidden states across the identified gaps could reduce the unwanted carry-over of past behaviors.
- Comparable geometric barriers might limit flexibility in other sequential tasks such as long-form story writing or multi-step reasoning chains.
- Measuring gap sizes on new models before deployment could serve as a diagnostic for how history-dependent those models will be.
Load-bearing premise
The observed correlation between Markov-chain state consistency and the similarity of consecutive hidden representations reflects a causal geometric trap rather than a surface-level statistical association arising from the model's training or architecture.
What would settle it
Observing strong behavioral persistence on new data or models while finding little or no correlation between the Markov-chain consistency scores and the hidden-representation similarity scores would falsify the geometric-trap account.
Original abstract
How does the conversational past of large language models (LLMs) influence their future performance? Recent work suggests that LLMs are affected by their conversational history in unexpected ways. For instance, hallucinations in prior interactions may influence subsequent model responses. In this work, we introduce History-Echoes, a framework that investigates how conversational history biases subsequent generations. The framework explores this bias from two perspectives: probabilistically, we model conversations as Markov chains to quantify state consistency; geometrically, we measure the consistency of consecutive hidden representations. Across three model families and six datasets spanning diverse phenomena, our analysis reveals a strong correlation between the two perspectives. By bridging these perspectives, we demonstrate that behavioral persistence manifests as a geometric trap, where gaps in the latent space confine the model's trajectory. Code available at https://github.com/technion-cs-nlp/OldHabitsDieHard.
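The probabilistic perspective in the abstract can be illustrated with a minimal sketch. It assumes each turn has already been labeled with a binary state (1 if the phenomenon of interest is present, 0 otherwise) and estimates a two-state transition matrix from the label sequence. The function names, the 0/1 encoding, and the toy sequence are hypothetical and are not taken from the paper's code. The trace of the matrix sums the two self-transition probabilities; for a two-state chain whose next state does not depend on the current one, both rows are identical and the trace equals 1, so larger values indicate stickier states.

```python
from collections import Counter


def transition_matrix(states):
    """Estimate a 2x2 transition matrix from a binary state sequence.

    states: list of 0/1 labels, one per turn (hypothetical encoding).
    Returns T where T[i][j] is the empirical P(next = j | current = i).
    """
    counts = Counter(zip(states, states[1:]))
    matrix = []
    for i in (0, 1):
        row_total = counts[(i, 0)] + counts[(i, 1)]
        if row_total == 0:
            # State i never occurs as a predecessor: fall back to a uniform row.
            matrix.append([0.5, 0.5])
        else:
            matrix.append([counts[(i, 0)] / row_total,
                           counts[(i, 1)] / row_total])
    return matrix


def consistency_trace(matrix):
    """Sum of the self-transition probabilities, P(0|0) + P(1|1)."""
    return matrix[0][0] + matrix[1][1]


# Toy label sequence, invented for illustration only.
turns = [1, 1, 1, 0, 0, 0, 0, 1, 1]
T = transition_matrix(turns)
print(T)                     # [[0.75, 0.25], [0.25, 0.75]]
print(consistency_trace(T))  # 1.5
```

How states are assigned to turns is left open here; in practice that labeling step determines what the trace actually measures.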
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the History-Echoes framework to study how conversational history biases subsequent LLM generations. Conversations are modeled as Markov chains to obtain a scalar measure of state consistency (probabilistic view) while cosine similarity (or equivalent) is computed between consecutive hidden representations (geometric view). Across three model families and six datasets, a strong correlation is reported between these quantities. The central claim is that this correlation demonstrates behavioral persistence as a 'geometric trap' in which gaps in the latent space confine the model's trajectory.
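The geometric view described in the summary can be sketched as follows, assuming one hidden-state vector per turn is already available as a plain list of floats. Which layer the vectors come from and how tokens are pooled into a single vector are not specified in this review, so both are left as assumptions; the helper names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def consecutive_similarities(hidden_states):
    """Similarity between each turn's vector and the next turn's vector."""
    return [cosine_similarity(a, b)
            for a, b in zip(hidden_states, hidden_states[1:])]


# Toy per-turn vectors, invented for illustration only.
toy_states = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
sims = consecutive_similarities(toy_states)
print(sims)  # [1.0, 0.0, 1.0]
```

A run of values near 1 means the representation barely moves between turns, and a drop marks a turn where it changes direction.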
Significance. If substantiated, the work offers a useful bridge between probabilistic modeling of dialogue dynamics and geometric analysis of internal representations. The multi-model, multi-dataset scope and public code release are strengths that support reproducibility and generality. The findings could inform mitigation strategies for history-induced biases such as persistent hallucinations, provided the geometric-trap interpretation is shown to be more than a correlational byproduct of autoregressive training.
major comments (2)
- [Abstract] Abstract (final paragraph) and corresponding discussion: the assertion that the reported correlation demonstrates an active 'geometric trap' in which 'gaps in the latent space confine the model's trajectory' is load-bearing for the central claim yet rests only on an observed association between Markov-chain consistency and hidden-state similarity. No intervention, ablation, or counterfactual experiment (e.g., perturbing trajectories while holding token probabilities fixed) is described that would distinguish causal geometric confinement from a passive statistical association arising from the training objective or data statistics. This distinction is required to elevate the result beyond correlation.
- [Experiments] Experiments section (results tables/figures): quantitative values, error bars, dataset sizes, and controls for confounding factors such as model scale or prompt length are not referenced in the abstract and must be explicitly reported with statistical tests to support the 'strong correlation' claim across the three model families and six datasets.
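One way the requested statistical support could look is a correlation coefficient paired with a permutation test, sketched below. This is an illustrative choice and not the test the authors report; the score lists are invented toy numbers, and the function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the observed |r|.

    Shuffles ys to break any pairing with xs and counts how often the
    shuffled |r| is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Invented toy scores: one (consistency, similarity) pair per condition.
consistency = [1.1, 1.3, 1.5, 1.2, 1.7, 1.6]
similarity = [0.52, 0.61, 0.70, 0.55, 0.83, 0.78]
r = pearson(consistency, similarity)
p = permutation_p_value(consistency, similarity)
print(round(r, 3), p)
```

Reporting the coefficient together with the sample size and a p-value or interval per model family and dataset would let readers judge how strong the correlation is in each setting.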
minor comments (2)
- [Methods] Clarify the precise definition of the Markov state and the hidden-representation similarity metric (e.g., which layer(s) and pooling method) in the methods section to improve reproducibility.
- [Abstract] The title and abstract use the term 'geometrically traps' before the supporting analysis is presented; consider softening to 'suggests a geometric component' until the causal evidence is strengthened.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which highlights important distinctions between correlation and causation as well as the need for clearer reporting of quantitative results. We address each major comment below and describe the revisions planned for the next manuscript version.
Point-by-point responses
- Referee: [Abstract] Abstract (final paragraph) and corresponding discussion: the assertion that the reported correlation demonstrates an active 'geometric trap' in which 'gaps in the latent space confine the model's trajectory' is load-bearing for the central claim yet rests only on an observed association between Markov-chain consistency and hidden-state similarity. No intervention, ablation, or counterfactual experiment (e.g., perturbing trajectories while holding token probabilities fixed) is described that would distinguish causal geometric confinement from a passive statistical association arising from the training objective or data statistics. This distinction is required to elevate the result beyond correlation.
  Authors: We agree that the current evidence consists of a strong observed correlation rather than direct causal interventions, and that the manuscript language in the abstract and discussion overstates the causal interpretation. The multi-model, multi-dataset consistency provides supporting evidence for the geometric-trap view but cannot by itself rule out passive associations from autoregressive training. In the revised manuscript we will (1) revise the abstract and discussion to state that the correlation is consistent with behavioral persistence manifesting as a geometric trap, (2) add an explicit limitations paragraph acknowledging the absence of counterfactual or ablation experiments, and (3) outline possible future directions for causal tests such as controlled trajectory perturbations. These changes will be made without introducing new experiments.
  revision: yes
- Referee: [Experiments] Experiments section (results tables/figures): quantitative values, error bars, dataset sizes, and controls for confounding factors such as model scale or prompt length are not referenced in the abstract and must be explicitly reported with statistical tests to support the 'strong correlation' claim across the three model families and six datasets.
  Authors: We accept this point. While the full experiments section already contains the requested quantitative details (correlation coefficients, standard errors, dataset sizes, and controls for model scale and prompt length together with statistical tests), these were not summarized in the abstract. In the revision we will update the abstract to report the key quantitative findings, including the range of correlation strengths, mention of error bars, and reference to the statistical tests performed. The experiments section will be edited for explicit cross-references to these controls and tests.
  revision: yes
Circularity Check
No circularity: correlation between independent Markov and geometric measures is presented as an empirical observation
full rationale
The paper defines two distinct measurements—Markov-chain state consistency for the probabilistic view and cosine similarity (or equivalent) of consecutive hidden representations for the geometric view—then reports their observed correlation across models and datasets as an empirical result. The conclusion that this manifests as a 'geometric trap' is an interpretive bridge from the correlation rather than a quantity that reduces by construction to the input definitions or to any fitted parameters. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided derivation chain; the measurements are computed separately and the link is data-driven rather than tautological.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Conversations can be modeled as Markov chains to quantify state consistency.
invented entities (1)
- geometric trap: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  "We model the conversation as a Markov chain over a binary state space... $\mathrm{Tr}(T) = P(s_{\phi^{+}} \mid s_{\phi^{+}}) + P(s_{\phi^{-}} \mid s_{\phi^{-}})$"
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking
  unclear: Relation between the paper passage and the cited Recognition theorem.
  "construct a two-dimensional orthonormal basis... $\theta_{\mathrm{ref}} = \theta(h'_{\phi^{+}}, h'_{\phi^{-}})$"
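The quoted passage about a two-dimensional orthonormal basis and a reference angle can be illustrated with a generic construction. The sketch below uses Gram-Schmidt on two reference vectors, projects vectors onto the resulting plane, and measures the angle between the projections. Which vectors the paper actually uses to span the plane is not stated in this review, so the inputs here are hypothetical stand-ins, and the construction is an assumption rather than the authors' procedure.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def orthonormal_basis_2d(a, b):
    """Gram-Schmidt: orthonormal pair (e1, e2) spanning the plane of a and b."""
    norm_a = norm(a)
    if norm_a == 0.0:
        raise ValueError("first vector must be non-zero")
    e1 = [x / norm_a for x in a]
    along = dot(b, e1)
    residual = [x - along * y for x, y in zip(b, e1)]
    norm_r = norm(residual)
    if norm_r < 1e-12:
        raise ValueError("vectors are collinear and do not span a plane")
    e2 = [x / norm_r for x in residual]
    return e1, e2


def project(h, basis):
    """2D coordinates of h in the plane spanned by the basis."""
    e1, e2 = basis
    return (dot(h, e1), dot(h, e2))


def angle(p, q):
    """Angle in radians between two non-zero vectors."""
    cosine = dot(p, q) / (norm(p) * norm(q))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))


# Hypothetical reference vectors standing in for two state representations.
h_pos = [1.0, 0.0, 0.0]
h_neg = [1.0, 1.0, 0.0]
basis = orthonormal_basis_2d(h_pos, h_neg)
theta_ref = angle(project(h_pos, basis), project(h_neg, basis))
print(theta_ref)  # approximately pi / 4
```

Projecting onto a shared plane discards any component orthogonal to both reference vectors, so angles measured there describe only the movement that lies between the two reference directions.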
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- AMEL: Accumulated Message Effects on LLM Judgments
  LLMs exhibit an accumulated message effect where conversation history saturated with positive or negative evaluations biases subsequent judgments, with larger shifts on uncertain items, a negativity asymmetry, and no ...
- SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy
  SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure and a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.