Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning
Pith reviewed 2026-05-16 17:10 UTC · model grok-4.3
The pith
Agent-Dice separates shared knowledge from task-specific gradient conflicts to let LLM agents learn new tasks without forgetting prior ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Agent-Dice executes a two-stage fusion on parameter updates: geometric consensus filtering removes gradients that disagree with the direction of the current model parameters, after which curvature-based importance weighting amplifies the remaining updates that encode semantics shared across tasks. The paper backs this explicit separation of common knowledge from task-specific interference in two ways: a theoretical analysis of the fusion scheme's validity, and experiments in which it mitigates catastrophic forgetting in LLM-based agents trained on sequences of GUI and tool-use tasks.
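As a rough illustration of the two stages named in the core claim, the following NumPy sketch filters per-task updates by cosine similarity and then averages the survivors with curvature-derived weights. It is not the authors' implementation. The function name `fuse_updates`, the choice of the mean update as the consensus reference, the `cos_threshold` parameter, and the generic non-negative curvature proxy are all assumptions made for this example.

```python
import numpy as np

def fuse_updates(updates, curvatures, cos_threshold=0.0, eps=1e-12):
    """Hypothetical two-stage fusion of per-task parameter updates.

    updates:    shape (num_tasks, num_params), one flattened update per task
    curvatures: same shape, non-negative importance proxies (for example
                squared gradients); this choice is an assumption of the sketch
    """
    updates = np.asarray(updates, dtype=float)
    curvatures = np.asarray(curvatures, dtype=float)

    # Stage 1 (assumed form): keep updates whose cosine similarity with the
    # mean update direction exceeds the threshold.
    reference = updates.mean(axis=0)
    norms = np.linalg.norm(updates, axis=1) * np.linalg.norm(reference)
    cosines = updates @ reference / (norms + eps)
    keep = cosines > cos_threshold
    if not keep.any():
        return np.zeros(updates.shape[1]), keep

    # Stage 2 (assumed form): per-parameter average of the kept updates,
    # weighted by each task's normalized curvature proxy.
    kept_updates = updates[keep]
    kept_curv = curvatures[keep]
    weights = kept_curv / (kept_curv.sum(axis=0, keepdims=True) + eps)
    fused = (weights * kept_updates).sum(axis=0)
    return fused, keep

# Toy usage: the third update points against the other two and is pruned.
fused, keep = fuse_updates(
    [[1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]],
    np.ones((3, 2)),
)
```

With uniform curvature the second stage reduces to a plain average of the kept updates, so any benefit from weighting only appears when the curvature proxy differs across tasks.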
What carries the argument
Directional consensus evaluation, which compares the angle between incoming gradient vectors and the existing parameter trajectory to prune conflicts and reweight shared content.
If this is right
- Agents trained sequentially on new interfaces or tools retain high accuracy on earlier ones without replay buffers.
- Only small additional computation is needed because the method operates on gradients already computed during normal fine-tuning.
- The stability-plasticity trade-off is reframed as a geometric separation problem rather than an optimization trade-off.
- Parameter counts stay close to the base model size because no task-specific adapters or heads are stored.
Where Pith is reading between the lines
- The same consensus filter could be tested on non-agent continual-learning benchmarks such as sequential image classification to check domain generality.
- If curvature weighting proves robust, the approach might combine with low-rank adapters to further cut storage while preserving the separation effect.
- Extending the directional test from gradients to attention-key or value updates could address forgetting in transformer internals rather than just final-layer weights.
Load-bearing premise
Measuring agreement in gradient direction reliably isolates common knowledge from interfering task-specific signals without introducing systematic bias.
What would settle it
Run the same sequence of GUI and tool-use tasks while replacing the directional-consensus filter with random gradient pruning of equal size; if prior-task accuracy collapses at the same rate as the unfiltered baseline, the separation claim is falsified.
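The control described above needs a random pruning mask that removes exactly as many entries as the consensus filter does. A minimal sketch of such a mask generator follows. The helper name `random_mask_like` and the boolean-mask representation of the filter's decisions are assumptions for illustration, not part of the paper's code.

```python
import numpy as np

def random_mask_like(consensus_mask, rng):
    """Return a random boolean mask keeping as many entries as consensus_mask.

    consensus_mask: boolean array, True where the (hypothetical) consensus
                    filter kept an update entry
    rng:            a numpy.random.Generator, passed in for reproducibility
    """
    consensus_mask = np.asarray(consensus_mask, dtype=bool)
    n_keep = int(consensus_mask.sum())
    flat = np.zeros(consensus_mask.size, dtype=bool)
    # Sample kept positions uniformly without replacement.
    kept_idx = rng.choice(consensus_mask.size, size=n_keep, replace=False)
    flat[kept_idx] = True
    return flat.reshape(consensus_mask.shape)

rng = np.random.default_rng(0)
consensus_mask = np.array([[True, False, True], [False, False, True]])
control_mask = random_mask_like(consensus_mask, rng)
```

Matching the kept count keeps the comparison fair: any gap in prior-task accuracy between the two masks can then be attributed to which entries are pruned rather than how many.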
Original abstract
Large Language Model (LLM)-based agents significantly extend the utility of LLMs by interacting with dynamic environments. However, enabling agents to continually learn new tasks without catastrophic forgetting remains a critical challenge, known as the stability-plasticity dilemma. In this work, we argue that this dilemma fundamentally arises from the failure to explicitly distinguish between common knowledge shared across tasks and conflicting knowledge introduced by task-specific interference. To address this, we propose Agent-Dice, a parameter fusion framework based on directional consensus evaluation. Concretely, Agent-Dice disentangles knowledge updates through a two-stage process: geometric consensus filtering to prune conflicting gradients, and curvature-based importance weighting to amplify shared semantics. We provide a rigorous theoretical analysis that establishes the validity of the proposed fusion scheme and offers insight into the origins of the stability-plasticity dilemma. Extensive experiments on GUI agents and tool-use agent domains demonstrate that Agent-Dice exhibits outstanding continual learning performance with minimal computational overhead and parameter updates. The codes are available at https://github.com/Wuzheng02/Agent-Dice.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the stability-plasticity dilemma in continual learning for LLM-based agents stems from failing to distinguish common knowledge from task-specific conflicting knowledge. It proposes Agent-Dice, a two-stage parameter fusion method consisting of geometric consensus filtering to prune conflicting gradients followed by curvature-based importance weighting to amplify shared semantics. The authors provide a rigorous theoretical analysis establishing the validity of the fusion scheme and report strong experimental results on GUI agents and tool-use agent domains, with minimal computational overhead and parameter updates.
Significance. If the geometric consensus reliably separates shared and conflicting updates without bias, the work would provide both a practical low-overhead method for agent continual learning and theoretical insight into the origins of catastrophic forgetting. The public code release supports reproducibility and potential follow-up work.
major comments (2)
- [Section 3] Section 3: The stability-plasticity bounds are derived under the assumption that task-specific updates lie outside a cone around the mean gradient direction. This cone condition is load-bearing for the pruning step; when violated by partial directional overlap (common in LLM fine-tuning where representations share components), the filtering may retain interference or discard useful shared updates. No explicit robustness analysis or counter-example handling is provided.
- [Section 4] Section 4 (experiments): The claim of 'outstanding' performance relies on the filtering and weighting steps being the causal factor, yet the manuscript lacks ablation results isolating the effect of the directional consensus threshold or curvature weighting under controlled interference levels. Without these, it is unclear whether gains exceed standard regularization baselines.
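The partial-overlap concern in the first major comment can be made concrete with a toy construction. The three-dimensional vectors below are invented for illustration and are not taken from the paper. Each update shares one component and carries a task-specific component of varying sign, and the sketch assumes the filter tests whole-vector cosine similarity against the mean update.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

shared = np.array([1.0, 0.0, 0.0])    # direction every task agrees on
conflict = np.array([0.0, 1.0, 0.0])  # direction tasks disagree on

# Each update mixes the shared direction with a task-specific component.
updates = np.stack([
    shared + 0.8 * conflict,
    shared - 0.8 * conflict,
    shared + 0.1 * conflict,
])
mean_dir = updates.mean(axis=0)

# Whole-vector test: every update has positive cosine with the mean,
# so a cone test with threshold 0 keeps all three.
cosines = [cosine(u, mean_dir) for u in updates]

# Coordinate-level view: the second coordinate still disagrees in sign.
conflict_signs = np.sign(updates[:, 1])
```

In this construction all three updates pass the whole-vector test even though their second coordinates pull in opposite directions, which is the retained-interference case the comment describes.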
minor comments (2)
- [Abstract] Abstract: The code repository link is given, but the manuscript should specify which scripts reproduce the exact tables and figures.
- [Section 3] Notation: The definitions of the consensus direction and curvature weight should be stated explicitly with equation numbers in the main text rather than deferred to appendices.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate additional analysis and experiments as suggested.
point-by-point responses
- Referee: [Section 3] Section 3: The stability-plasticity bounds are derived under the assumption that task-specific updates lie outside a cone around the mean gradient direction. This cone condition is load-bearing for the pruning step; when violated by partial directional overlap (common in LLM fine-tuning where representations share components), the filtering may retain interference or discard useful shared updates. No explicit robustness analysis or counter-example handling is provided.
  Authors: We acknowledge that the cone condition is central to the theoretical bounds and pruning step. While the analysis in Section 3 derives the bounds under this assumption to characterize the stability-plasticity trade-off, we agree that robustness to partial directional overlap merits explicit treatment. In the revised version, we will add a dedicated subsection with (i) a theoretical extension bounding the residual interference under mild cone violations and (ii) synthetic counter-examples on low-dimensional gradient vectors that simulate partial overlap. These additions will show that the subsequent curvature-based weighting step mitigates retained interference, preserving the method's practical utility even when the strict cone condition is only approximately satisfied in LLM fine-tuning.
  revision: yes
- Referee: [Section 4] Section 4 (experiments): The claim of 'outstanding' performance relies on the filtering and weighting steps being the causal factor, yet the manuscript lacks ablation results isolating the effect of the directional consensus threshold or curvature weighting under controlled interference levels. Without these, it is unclear whether gains exceed standard regularization baselines.
  Authors: We agree that stronger causal evidence is desirable. The current experiments already compare Agent-Dice against standard regularization baselines (EWC, SI, and MAS) and report consistent gains with low overhead, but they do not isolate the two stages under controlled interference. In the revision we will add targeted ablations that (i) sweep the directional consensus threshold while fixing curvature weighting and (ii) vary the curvature importance factor while disabling filtering, all under synthetic interference levels generated by controlled gradient overlap. These results will be presented alongside the existing baseline comparisons to demonstrate that the performance improvements are attributable to the proposed geometric consensus and curvature mechanisms rather than generic regularization.
  revision: yes
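The threshold sweep under controlled gradient overlap promised in the second response could be organized along the following lines. This is only a sketch of the harness. The generator `synthetic_updates`, the metric `kept_fraction`, and the grid values are hypothetical, and a real ablation would measure task accuracy rather than the fraction of updates that survive filtering.

```python
import numpy as np

def synthetic_updates(num_tasks, dim, overlap, rng):
    """Unit-norm updates whose cosine with a shared direction equals `overlap`."""
    shared = np.zeros(dim)
    shared[0] = 1.0
    noise = rng.standard_normal((num_tasks, dim))
    noise[:, 0] = 0.0  # keep the task-specific part orthogonal to `shared`
    noise /= np.linalg.norm(noise, axis=1, keepdims=True)
    return overlap * shared + np.sqrt(1.0 - overlap**2) * noise

def kept_fraction(updates, cos_threshold):
    """Fraction of updates whose cosine with the mean update exceeds the threshold."""
    reference = updates.mean(axis=0)
    norms = np.linalg.norm(updates, axis=1) * np.linalg.norm(reference)
    cosines = updates @ reference / norms
    return float((cosines > cos_threshold).mean())

rng = np.random.default_rng(0)
results = {}
for overlap in (0.2, 0.5, 0.9):          # controlled interference level
    updates = synthetic_updates(num_tasks=8, dim=64, overlap=overlap, rng=rng)
    for threshold in (0.0, 0.3, 0.6):    # directional consensus threshold
        results[(overlap, threshold)] = kept_fraction(updates, threshold)
```

Reusing the same synthetic updates across all thresholds at a given overlap level means differences along the threshold axis reflect the filter alone, not resampling noise.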
Circularity Check
No significant circularity; derivation remains independent of inputs
full rationale
The paper introduces a two-stage fusion method (directional consensus pruning followed by curvature weighting) and claims a theoretical analysis establishing its validity. No equations or steps in the abstract or description reduce any claimed prediction or bound to a fitted parameter or self-citation by construction. The stability-plasticity insight is presented as arising from the proposed disentanglement rather than being presupposed in the inputs. Self-citations, if any, are not load-bearing for the core claims, and the method is externally falsifiable via experiments on GUI and tool-use agents. This is the standard non-circular case.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Gradient directions can be used to evaluate consensus and separate common knowledge from task-specific interference.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: geometric consensus filtering to prune conflicting gradients, and curvature-based importance weighting to amplify shared semantics
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Theorem 2 (Consensus-Induced Variance Reduction) ... Hoeffding’s inequality ... $\exp(-2|S_j|(p-0.5)^2)$
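The expression quoted from Theorem 2 has the shape of a standard Hoeffding tail bound. Under the assumption (not confirmed by the excerpt) that |S_j| counts independent votes that each agree with the correct direction with probability p > 0.5, Hoeffding's inequality bounds the probability that at most half of them agree by exp(-2|S_j|(p-0.5)^2). The short sketch below evaluates that expression; the function name and the example values are illustrative only.

```python
import math

def hoeffding_bound(num_votes, p):
    """Evaluate exp(-2 * n * (p - 0.5)^2) for n votes with agreement probability p.

    Under the independence assumption stated above, this upper-bounds the
    probability that a majority vote among `num_votes` picks the wrong direction.
    """
    return math.exp(-2.0 * num_votes * (p - 0.5) ** 2)

# The bound shrinks exponentially as the consensus set grows.
bounds = {n: hoeffding_bound(n, 0.7) for n in (5, 20, 50)}
```

For p = 0.7 and 50 votes the exponent is -2 · 50 · 0.04 = -4, so the bound is exp(-4), roughly 0.018. With only 5 votes the bound is exp(-0.4), which is much weaker.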
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.