pith. machine review for the scientific record.

arxiv: 2604.12806 · v1 · submitted 2026-04-14 · 💻 cs.LG

Interpretable Relational Inference with LLM-Guided Symbolic Dynamics Modeling


Pith reviewed 2026-05-10 15:59 UTC · model grok-4.3

classification 💻 cs.LG
keywords relational inference · symbolic regression · large language models · interaction networks · dynamical systems · epidemic modeling · interpretable machine learning

The pith

COSINE jointly recovers interaction graphs and sparse symbolic equations by letting an LLM adapt the mathematical library during optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to solve the inverse problem of inferring hidden interaction structures and explicit dynamical rules from observed time series in many-body systems. Standard neural models deliver accurate predictions but hide the underlying mechanisms inside opaque graphs, while classical symbolic regression requires a fixed library of functions and often assumes the topology is already known. COSINE addresses both limitations by running an inner differentiable loop that co-optimizes network edges and sparse symbolic expressions, then feeding performance signals to an outer large-language-model loop that prunes useless terms and proposes new ones. When the claim holds, researchers obtain compact, human-readable equations whose terms align with known physical or biological mechanisms rather than black-box predictors alone.

Core claim

COSINE is a differentiable co-optimization framework that simultaneously learns interaction graphs and sparse symbolic dynamical expressions; an outer-loop large language model adaptively prunes and expands the function library using feedback from the inner optimization, yielding both accurate structural recovery and compact, mechanism-aligned equations on synthetic benchmarks and large-scale epidemic data.

What carries the argument

The COSINE co-optimization loop that alternates between updating network edges and symbolic coefficients while an LLM uses inner-loop performance to revise the candidate function library.

If this is right

  • Accurate recovery of hidden interaction structures becomes possible without assuming a fixed topology in advance.
  • Dynamical equations remain sparse and directly readable, aligning with known mechanisms in the data-generating process.
  • The same pipeline works on both small synthetic systems and large real-world epidemic records.
  • Interpretability is achieved without sacrificing the accuracy previously available only from black-box neural surrogates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The approach could be tested on climate or neural population data where partial mechanistic knowledge exists but full symbolic forms are unknown.
  • If LLM bias remains small across domains, the need for hand-crafted function libraries in scientific modeling would decrease.
  • Scaling tests on higher-dimensional systems would reveal whether sparsity is preserved when the number of variables grows.

Load-bearing premise

The true dynamics admit a sparse symbolic representation whose quality can be judged reliably by LLM feedback without systematic bias in the library suggestions.

What would settle it

Apply the method to a known dynamical system whose governing terms lie outside both the initial library and the LLM's subsequent proposals; recovery should then produce either dense expressions or visibly incorrect interaction graphs.
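In miniature, the proposed test reduces to a span argument. The sketch below uses hypothetical toy targets and a fixed polynomial library rather than a full dynamical system: when the ground truth lies in the library's span the residual collapses to numerical zero, and when it does not (here exp(-x), whose constant component no term through the origin can supply) the residual stays visibly large no matter how the coefficients are tuned.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 200)
Theta = np.stack([x, x**2, x**3], axis=1)     # fixed library; every term vanishes at x = 0

def fit_residual(y):
    # Least-squares fit over the library, returning the relative residual.
    w = np.linalg.lstsq(Theta, y, rcond=None)[0]
    return np.linalg.norm(Theta @ w - y) / np.linalg.norm(y)

res_in = fit_residual(0.8 * x - 0.2 * x**3)   # target inside the library's span
res_out = fit_residual(np.exp(-x))            # target with a constant component the library cannot supply
```

A gap of several orders of magnitude between the two residuals is the signature the probe looks for; in the full method the analogous symptom would be a dense expression or a distorted graph compensating for the missing term.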

Figures

Figures reproduced from arXiv:2604.12806 by Juyuan Zhang, Liming Pan, Linyuan Lü, Xiaoxiao Liang.

Figure 1: Conceptual comparison of relational inference, symbolic regression, and COSINE. Relational inference recovers latent interaction graphs from dynamical data; symbolic regression discovers governing equations of dynamics given known structures; COSINE integrates sparse symbolic modeling with graph learning to jointly infer both structure and dynamics.

Figure 2: The COSINE architecture. (Left) LLM-based reasoning refines the basis library Θ(·) via performance feedback. (Middle) Graph-guided modeling co-optimizes latent edges A_ij and coefficients W through differentiable symbolic message-passing. (Right) The symbolic basis library bridges high-level reasoning with numerical discovery of governing mechanisms.

Figure 4: Ablation of LLM-guided evolution. COSINE vs. a fixed library (w/o update) and threshold-based pruning (λ).

Figure 3: Results on COVID-19 epidemic data. Left: mobility graph and inferred graph. Right: learned weights of dominant terms in the Message and Update modules across four U.S. states.

Figure 6: Computational efficiency and memory scaling comparison between COSINE and NRI (batch size 64, hidden dimension 256) across network sizes N ∈ {5, 10, 50, 100, 200}. (a) Average epoch time. (b) Peak GPU memory usage. Note the logarithmic scale on the vertical axes.

Figure 7: Visual illustration of Michaelis–Menten kinetics. (a) Node activities, with the trajectories of the first 5 nodes highlighted. (b) Adjacency matrix of the network.

Figure 8: Visual illustration of diffusion dynamics. (a) Node activities, with the trajectories of the first 5 nodes highlighted. (b) Adjacency matrix of the network.

Figure 9.

Figure 10: Visual illustration of the Kuramoto model. (a) Node activities, with the trajectories of the first 5 nodes highlighted. (b) Adjacency matrix of the network. (c) Natural frequencies ω_i of each node.

Figure 11: Visual illustration of Friedkin–Johnsen dynamics. (a) Node activities, with the trajectories of the first 5 nodes highlighted. (b) Adjacency matrix of the network. (c) Innate opinions s_i of each node.

Figure 12: Visual illustration of a coupled map network. (a) Node activities, with the trajectories of the first 5 nodes highlighted. (b) Adjacency matrix of the network.

Figure 13: Evolution of discovered basis terms for MM and CMN systems. Panels (a, b) show Message term weights, while (c, d) show Update term weights for the first (blue) and best (red) rounds.

Figure 14: Closed-loop evolution of the basis library in the Kuramoto system over the first 4 rounds (a–d). The first row displays the training loss (red) and structural-inference AUC (blue) over four optimization rounds. The second and third rows illustrate the distribution of learned coefficients (importance) for candidate terms in the message and update libraries, respectively.

Figure 15: Epidemiological data for Arizona. (a) Geographic map and county-level interaction network. (b) Ridgeline plot of daily COVID-19 confirmed cases across counties from February 2020 to February 2023.

Figure 16: Epidemiological data for Connecticut. (a) County-level interaction network. (b) Daily COVID-19 confirmed cases across Connecticut counties.

Figure 17: Epidemiological data for Illinois. (a) Geographic map and county-level interaction network. (b) Ridgeline plot showing the high-density case distribution across over 100 counties.

Figure 18: Epidemiological data for Michigan. (a) Geographic map and county-level interaction network. (b) Ridgeline plot of daily COVID-19 confirmed cases illustrating regional seasonal variations.
Original abstract

Inferring latent interaction structures from observed dynamics is a fundamental inverse problem in many-body interacting systems. Most neural approaches rely on black-box surrogates over trainable graphs, achieving accuracy at the expense of mechanistic interpretability. Symbolic regression offers explicit dynamical equations and stronger inductive biases, but typically assumes known topology and a fixed function library. We propose COSINE (Co-Optimization of Symbolic Interactions and Network Edges), a differentiable framework that jointly discovers interaction graphs and sparse symbolic dynamics. To overcome the limitations of fixed symbolic libraries, COSINE further incorporates an outer-loop large language model that adaptively prunes and expands the hypothesis space using feedback from the inner optimization loop. Experiments on synthetic systems and large-scale real-world epidemic data demonstrate robust structural recovery and compact, mechanism-aligned dynamical expressions. Code: https://anonymous.4open.science/r/COSINE-6D43.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces COSINE, a differentiable co-optimization framework for jointly inferring latent interaction graphs and sparse symbolic dynamical equations from time-series observations. An inner optimization loop fits graph edges and symbolic expressions from a function library, while an outer-loop LLM adaptively prunes and expands that library using performance feedback. The central claim is that this yields robust structural recovery and compact, mechanism-aligned expressions on synthetic systems and large-scale epidemic data.

Significance. If the claims hold after addressing the noted gaps, the work would offer a meaningful advance in interpretable relational inference by relaxing the fixed-library assumption common in symbolic regression while retaining explicit dynamical forms. The joint graph-symbolic optimization and LLM-guided adaptation represent a novel combination with potential applicability to epidemic modeling and other interacting systems. Credit is due for providing anonymous code and for targeting both topology and mechanism discovery in one framework.

major comments (3)
  1. [Abstract] Abstract: the claim of 'robust structural recovery' is presented without any quantitative metrics, error bars, baseline comparisons, or ablation results, which is load-bearing for evaluating performance given the variability introduced by LLM guidance.
  2. [Method] Method section (outer LLM loop): the assumption that LLM feedback reliably selects an unbiased, high-quality function library is central to the interpretability benefit, yet no controls, inter-LLM consistency checks, prompt-robustness tests, or comparisons against fixed libraries are described; without these the reported compactness on epidemic data could be an artifact of LLM bias toward familiar or short expressions.
  3. [Experiments] Experiments: the abstract states results on synthetic systems and epidemic data but provides no details on library sizes, sparsity regularization values, or how LLM temperature/selection criteria were chosen, leaving the free parameters unexamined and the robustness claim difficult to reproduce or falsify.
minor comments (2)
  1. [Method] Notation for the joint objective and LLM feedback signal should be defined more explicitly with equations to clarify the inner/outer loop interaction.
  2. [Experiments] The epidemic dataset description would benefit from a table summarizing size, variables, and ground-truth interaction structure if available.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback and positive assessment of COSINE's potential contribution to interpretable relational inference. We agree that strengthening the abstract, adding robustness controls for the LLM component, and providing explicit experimental details will improve clarity and reproducibility. Below we respond point-by-point to the major comments and indicate the planned revisions.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the claim of 'robust structural recovery' is presented without any quantitative metrics, error bars, baseline comparisons, or ablation results, which is load-bearing for evaluating performance given the variability introduced by LLM guidance.

    Authors: We agree that the abstract would be stronger with concrete quantitative support for the 'robust structural recovery' claim. The Experiments section already reports these metrics (including error bars, baselines, and ablations), but the abstract summarizes at a high level without them. We will revise the abstract to include key results such as structural recovery accuracies with standard deviations and comparisons to baselines, making the performance claims directly evaluable. revision: yes

  2. Referee: [Method] Method section (outer LLM loop): the assumption that LLM feedback reliably selects an unbiased, high-quality function library is central to the interpretability benefit, yet no controls, inter-LLM consistency checks, prompt-robustness tests, or comparisons against fixed libraries are described; without these the reported compactness on epidemic data could be an artifact of LLM bias toward familiar or short expressions.

    Authors: This correctly highlights a gap in validating the LLM-guided adaptation. The manuscript presents the outer-loop LLM as enabling adaptive library pruning/expansion, but lacks explicit controls or comparisons. We will add a new subsection with: (i) direct comparisons of COSINE against fixed-library baselines, (ii) inter-run consistency results across multiple LLM calls, and (iii) prompt-sensitivity analysis. These will help confirm that compactness and performance gains are not due to LLM bias. revision: yes

  3. Referee: [Experiments] Experiments: the abstract states results on synthetic systems and epidemic data but provides no details on library sizes, sparsity regularization values, or how LLM temperature/selection criteria were chosen, leaving the free parameters unexamined and the robustness claim difficult to reproduce or falsify.

    Authors: We acknowledge the need for full hyperparameter transparency to support reproducibility. While the Methods and Experiments sections describe the overall library construction, optimization, and LLM integration, specific values (library sizes per dataset, sparsity regularization coefficients, LLM temperature, and selection criteria) for the epidemic experiments were not tabulated. We will add a detailed table and accompanying text specifying all these parameters, along with any sensitivity checks. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper introduces the COSINE framework as a joint optimization procedure combining graph inference with sparse symbolic regression, augmented by an outer LLM loop for adaptive library pruning based on inner-loop feedback. The abstract and method description present this as an algorithmic construction whose outputs (recovered graphs and expressions) are validated on held-out synthetic and real epidemic data. No equations, uniqueness theorems, or self-citations are invoked that would reduce the claimed structural recovery or mechanism alignment to a fitted parameter or input by construction. The LLM component is treated as an external oracle rather than a self-referential definition, and experimental claims remain falsifiable against independent benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that real dynamical systems are well-approximated by sparse symbolic expressions and that LLM feedback can usefully steer the hypothesis space without circular dependence on the fitted model.

free parameters (2)
  • sparsity regularization weight
    Controls the trade-off between equation simplicity and data fit in the joint optimization.
  • LLM prompt temperature and selection criteria
    Hyperparameters that determine which functions the language model adds or removes.
axioms (1)
  • domain assumption: Observed trajectories are generated by a system whose interactions and dynamics admit a sparse symbolic representation.
    Invoked to justify the use of symbolic regression inside the graph inference loop.
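The role of the first free parameter can be illustrated with sequentially thresholded least squares, a SINDy-style proxy for the paper's sparsity regularization (the actual regularizer is not reproduced here): the threshold stands in for λ, trading data fit against the number of surviving library terms.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 300)
Theta = np.stack([x, x**2, np.sin(x)], axis=1)   # candidate library
y = 0.7 * np.sin(x)                              # noiseless target using a single term

def stlsq(Theta, y, threshold, iters=5):
    # Sequentially thresholded least squares: fit, zero small coefficients, refit survivors.
    w = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(iters):
        keep = np.abs(w) >= threshold
        w = np.zeros(Theta.shape[1])
        if keep.any():
            w[keep] = np.linalg.lstsq(Theta[:, keep], y, rcond=None)[0]
    return w

w_mild = stlsq(Theta, y, threshold=0.1)    # moderate threshold: only the sin term survives
w_harsh = stlsq(Theta, y, threshold=1.0)   # excessive threshold: every term is pruned
```

Set too low, the threshold admits spurious terms; set too high, it deletes the true mechanism outright, which is exactly the trade-off the ledger flags as unexamined.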

pith-pipeline@v0.9.0 · 5482 in / 1272 out tokens · 39830 ms · 2026-05-10T15:59:50.694537+00:00 · methodology

