
arxiv: 2606.12923 · v1 · pith:ROVG6VBFnew · submitted 2026-06-11 · 💻 cs.LG · cs.AI · cs.CL

Order Is Not Control

Pith reviewed 2026-06-27 07:13 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CL
keywords receiver-gated response law · control versus order · AI alignment · neural perturbation · driven-dissipative systems · stochastic operators · biological control · LLM steering

The pith

Control requires a receiver-gated response law rather than order alone, as shown by consistent patterns across biological and LLM systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that order-inducing objects in AI alignment and neural studies do not amount to control. Instead, control depends on a receiver-gated response law, defined as a denominator-indexed operator that takes material state, action or drive, bath, and receiver state as inputs to produce response displacement, sinks, effort, and basin projection. This law appears in mouse ALM, C. elegans, zebrafish, LLM output panels, adapters, and stochastic operators, where interventions prove local and context-dependent. A reader would care because it reframes steering and perturbation work around measurable response operators instead of assuming order produces controllable outcomes. The result is a driven-dissipative account in which drives act through prepared media to yield admitted movement or overdrive.

Core claim

Order is not control. Control requires a receiver-gated response law: a denominator-indexed operator mapping material state, action/drive, bath, and receiver state to response displacement, sinks, effort, and basin projection. The law is identified across biological panels (mouse ALM, C. elegans, zebrafish) and LLM panels, where response vectors are predictable at 72.8-73.7 percent component-sign accuracy (rising to 84.3-84.8 percent on nonzero components) and held-out observers predict system-effect and target families at 93.6 percent and 91.7 percent accuracy. Interventions are admitted, saturated, sign-changing, leaky, or overdriven depending on medium, bath, receiver state, action port,

What carries the argument

The receiver-gated response law, a denominator-indexed operator that maps material state, action/drive, bath, and receiver state to response displacement, sinks, effort, and basin projection.
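As a reading aid, the verbal definition above can be written as a type signature. The symbols below are assumptions introduced here for illustration; this page does not reproduce the paper's own notation or any explicit functional form.

```latex
% Hypothetical notation: d is the fixed denominator (comparison context) that indexes the operator.
\[
  \mathcal{R}_{d} : (m,\, a,\, b,\, r) \;\longmapsto\; (\Delta,\, s,\, e,\, \pi)
\]
% Inputs:  m = material state, a = action/drive, b = bath, r = receiver state.
% Outputs: \Delta = response displacement, s = sinks, e = effort, \pi = basin projection.
```

Written this way, the signature fixes only the inputs and outputs named in the text. It says nothing about saturation, leakage, or sign-change rules.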

If this is right

  • An intervention can be admitted, saturated, sign-changing, leaky, or overdriven depending on medium, bath, receiver state, action port, and comparator.
  • Control is assigned when finite effort moves a target or outcome-readout class under the same denominator while damage, null/evasive, invalid format, overdrive, and unnecessary effort stay bounded.
  • Constitution-conditioned adapters reshape susceptibility as prepared media.
  • Stochastic-operator panels separate measured opportunity from deployable action policies.
  • The evidence supports local admitted control and measurable stochastic response operators at the mesoscopic level.
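The control-assignment bullet above can be read as a conjunction of three checks under one fixed denominator: the target moves, effort stays finite, and every sink channel stays bounded. The sketch below is a minimal illustration of that reading, not the paper's procedure. The `Response` fields, the function name, and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One measured response under a fixed denominator (all fields are hypothetical)."""
    target_shift: float      # movement of the target or outcome-readout class
    effort: float            # effort spent by the drive
    sinks: dict              # sink channel name -> measured rate


def assign_control(resp, min_shift=0.1, max_effort=1.0, max_sink=0.05):
    """Return True only when the target moves, effort is bounded, and all sinks are bounded.

    The thresholds are illustrative assumptions, not values reported in the paper.
    """
    moved = resp.target_shift >= min_shift
    finite_effort = resp.effort <= max_effort
    sinks_bounded = all(rate <= max_sink for rate in resp.sinks.values())
    return moved and finite_effort and sinks_bounded


# An admitted intervention: the target moves and every sink channel stays small.
admitted = Response(0.4, 0.6, {"damage": 0.01, "null_evasive": 0.02,
                               "invalid_format": 0.0, "overdrive": 0.03})
# An overdriven intervention: large movement, but the overdrive sink is not bounded.
overdriven = Response(0.9, 0.8, {"damage": 0.01, "null_evasive": 0.02,
                                 "invalid_format": 0.0, "overdrive": 0.30})
# No movement at all: nothing to assign, even though no sink is triggered.
inert = Response(0.0, 0.2, {"damage": 0.0, "null_evasive": 0.0,
                            "invalid_format": 0.0, "overdrive": 0.0})

print(assign_control(admitted), assign_control(overdriven), assign_control(inert))
```

Under these assumed thresholds only the first case counts as control. The second moves the target further but fails the sink bound, which matches the distinction the bullets draw between admitted movement and overdrive.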

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The local character of the response law implies that alignment techniques may need to prepare specific receiver states rather than rely on global order.
  • If the same denominator-indexed structure holds across domains, perturbation experiments could be designed to test transfer of control metrics from biological to artificial systems without assuming coordinate identity.
  • The separation of opportunity from deployable policies in stochastic panels suggests a route to quantify when an LLM intervention remains within bounded effort.
  • Future tests could check whether adapter conditioning consistently alters the sign and saturation behavior of generated responses under fixed drives.

Load-bearing premise

The biological panels and LLM panels demonstrate instances of the same receiver-gated response law.

What would settle it

If held-out observers in the LLM panels fail to predict system-effect and target/oracle families at the reported 93.6 percent and 91.7 percent accuracy, or if biological interventions do not exhibit the local admitted, saturated, or overdriven patterns under varied receiver states.
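One way to make "fail to predict at the reported accuracy" concrete is a one-sided exact binomial check on a replication. The sketch below is an assumption about how such a check could be run, not something the paper or this review specifies. The replication counts are hypothetical, and independence of the held-out items is assumed.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def falls_short(correct, total, reported, alpha=0.05):
    """True when a result this low would be unlikely if the reported accuracy held.

    `correct` and `total` come from a replication; `reported` is the accuracy
    claimed in the paper (for example 0.936 or 0.917).
    """
    return binom_cdf(correct, total, reported) < alpha


# Hypothetical replication counts, for illustration only.
print(falls_short(80, 100, 0.936))  # 80 of 100 correct against a reported 93.6%
print(falls_short(93, 100, 0.936))  # 93 of 100 correct against a reported 93.6%
```

With these made-up counts, the first replication would count against the reported figure and the second would not.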

Figures

Figures reproduced from arXiv: 2606.12923 by Gareth Seneque, Jeffrey Molendijk, Lap-Hang Ho, Nafise Erfanian Saeedi, Tim Elson.

Figure 1: Receiver-gated response law and evidence key. Candidate drives become control only when a measured receiver admits them into target-basin movement with sink and effort channels bounded. The figure defines the response-chain object rather than a single example: the same denominator-conditioned structure is measured at biological, generated-output, adapter, and stochastic-operator ports. Local semantic-repai…
Figure 2: Cross-substrate response-law bridge. The bridge is role-level: biological, LLM-output, and adapter-media systems instantiate the same denominator-indexed response-law roles, not shared raw coordinates or universal control. The biological column shows physical perturbation-substrate response operators: material condition, drive, bath or protocol, receiver, response displacement, sink routing, held-out respo…
Figure 3: LLM generated-output response dynamics under matched denominators. Visible semantic fields and finite action probes are locally admitted and can overdrive when dose or budget increases. The generated-output and finite-dose panels fix prompt family, model/material state, decode bath, visible-field action, text-only completion rubric, and comparator. Scheduler validation supports local field selection, while…
Figure 4: Adapters as prepared response media. Constitution-conditioned adapters change how the same visible fields are admitted into generated-output basins. The denominator fixes base model family, frozen adapter state, prompt/bath, visible-field action, text-only completion rubric, and matched base-to-adapter pairing. The key scale is the 384-cell response tensor and 288 matched pairs summarised in Section 5 and …
Figure 5: Evidence ladder for predictive response laws and the controller limits. Panel A shows response-vector sign prediction across the four main material conditions. Panel B shows non-endpoint held-out observer prediction for system-effect and target/oracle binary families. Panel C separates clean local-control pockets from sink/overdrive, missing-projection, and sign-changing classes. Panel D compares the teste…
Original abstract

AI alignment, interpretability, steering, and neural perturbation studies identify order-inducing objects. We argue that order is not control. Control requires a receiver-gated response law: a denominator-indexed operator mapping material state, action/drive, bath, and receiver state to response displacement, sinks, effort, and basin projection. We identify it across biological, LLM, adapter, and stochastic-operator panels. The laws are local: an intervention can be admitted, saturated, sign-changing, leaky, or overdriven depending on medium, bath, receiver state, action port, and comparator. Control is assigned when finite effort moves a target or outcome-readout class under the same denominator while damage, null/evasive, invalid format, overdrive, and unnecessary effort stay bounded. Mouse ALM, C. elegans, and zebrafish panels provide physical response-operator evidence while excluding coordinate identity and controller conclusions. LLM panels show generated-output response laws: across four material conditions, response vectors are predictable at 72.8-73.7% component-sign accuracy, rising to 84.3-84.8% on nonzero components; held-out observers predict system-effect and target/oracle families at 93.6% and 91.7% accuracy. Constitution-conditioned adapters reshape susceptibility as prepared media, and stochastic-operator panels separate measured opportunity from deployable action policies. This gives a driven-dissipative response-system account at the mesoscopic control level: drives act through prepared media, baths, and receivers, producing admitted movement, impedance, sinks, or overdrive. The evidence supports local admitted control and measurable stochastic response operators, while leaving deployable pre-generation control, hidden/logit causal sufficiency, biological-to-LLM coordinate identity, and literal thermodynamic quantities outside scope.
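The abstract reports two versions of component-sign accuracy: one over all response-vector components and one restricted to nonzero components. A minimal sketch of how the two figures could differ is given below. It assumes that "nonzero" refers to components whose observed response is nonzero, which the abstract does not state. The function names and example vectors are hypothetical.

```python
def sign(x, tol=0.0):
    """Map a value to -1, 0, or +1, treating |x| <= tol as zero."""
    if x > tol:
        return 1
    if x < -tol:
        return -1
    return 0


def component_sign_accuracy(predicted, observed, tol=0.0):
    """Return (accuracy over all components, accuracy over observed-nonzero components)."""
    pairs = [(sign(p, tol), sign(o, tol)) for p, o in zip(predicted, observed)]
    overall = sum(p == o for p, o in pairs) / len(pairs)
    nonzero = [(p, o) for p, o in pairs if o != 0]
    if nonzero:
        on_nonzero = sum(p == o for p, o in nonzero) / len(nonzero)
    else:
        on_nonzero = float("nan")  # undefined when no component responds
    return overall, on_nonzero


# Hypothetical six-component response vector, for illustration only.
observed = [0.5, -0.2, 0.0, 0.3, 0.0, -0.7]
predicted = [0.4, -0.1, 0.2, 0.1, 0.0, 0.3]
overall, on_nonzero = component_sign_accuracy(predicted, observed)
print(overall, on_nonzero)
```

In this toy case four of six signs match overall, and three of the four observed-nonzero components match. The two metrics can diverge in either direction depending on how well the predictor handles components with zero observed response.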

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript argues that order is not control. Control is defined as requiring a receiver-gated response law: a denominator-indexed operator mapping material state, action/drive, bath, and receiver state to response displacement, sinks, effort, and basin projection. The paper claims to identify this law in biological panels (mouse ALM, C. elegans, zebrafish) providing physical response-operator evidence and in LLM panels, reporting 72.8-73.7% component-sign accuracy (84.3-84.8% on nonzero components) across four material conditions, plus held-out prediction accuracies of 93.6% for system-effect and 91.7% for target/oracle families. It concludes that the laws are local (admitted, saturated, sign-changing, leaky, or overdriven depending on conditions), control is assigned when finite effort moves a target under the same denominator with bounded damage/null/overdrive, and this yields a driven-dissipative mesoscopic account while excluding coordinate identity, controller conclusions, and pre-generation control.

Significance. If the central claim holds, the work supplies a quantifiable, domain-spanning distinction between order-inducing objects and control at the mesoscopic level, with direct relevance to AI alignment, interpretability, and neural perturbation studies. The LLM panel accuracies constitute concrete, falsifiable measurements of response predictability under material conditions, and the exclusion of certain inferences (e.g., coordinate identity) is explicitly scoped. The approach of treating adapters as prepared media and separating opportunity from policy is a constructive modeling choice.

major comments (2)
  1. [Abstract] The inference that the identical denominator-indexed operator operates across biological and LLM panels is not supported by an explicit common operator expression, reduction, or equivalence proof showing that the functional form, indexing, saturation/leakage rules, and basin-projection mapping match between the physical response operators and the LLM-generated output laws. The reported component-sign and held-out accuracies establish only statistical predictability of outputs under four material conditions; they do not demonstrate structural identity of the operator.
  2. [Abstract] The definition of control as necessarily requiring the receiver-gated response law is introduced axiomatically. No independent criterion or comparison to standard control-theoretic notions (e.g., feedback, observability, or dissipativity) is supplied to establish that this specific mapping is required for control rather than being a sufficient but non-necessary characterization; this renders the claim that order-inducing objects are not control dependent on acceptance of the new definition.
minor comments (2)
  1. The abstract refers to 'four material conditions' without enumerating them; an explicit list or table would improve reproducibility and allow readers to assess whether the conditions are commensurate across biological and LLM panels.
  2. The manuscript would benefit from a consolidated table reporting accuracies, component counts, and held-out metrics side-by-side for all panels (biological, LLM, adapter, stochastic-operator) to facilitate direct comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We respond point by point to the major comments and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Abstract] The inference that the identical denominator-indexed operator operates across biological and LLM panels is not supported by an explicit common operator expression, reduction, or equivalence proof showing that the functional form, indexing, saturation/leakage rules, and basin-projection mapping match between the physical response operators and the LLM-generated output laws. The reported component-sign and held-out accuracies establish only statistical predictability of outputs under four material conditions; they do not demonstrate structural identity of the operator.

    Authors: We agree that the manuscript does not supply an explicit common operator expression or formal equivalence proof establishing structural identity between the biological and LLM instantiations. The identification rests on the shared functional form of the denominator-indexed mapping together with the empirical observation that this form produces statistically predictable response components under four matched material conditions. The reported accuracies therefore demonstrate consistent applicability of the operator structure rather than a reduction or identity proof. We will revise the abstract and the relevant results sections to replace the phrasing of an 'identical' operator with language that the same functional form is identified and shown to be predictive across domains. revision: partial

  2. Referee: [Abstract] The definition of control as necessarily requiring the receiver-gated response law is introduced axiomatically. No independent criterion or comparison to standard control-theoretic notions (e.g., feedback, observability, or dissipativity) is supplied to establish that this specific mapping is required for control rather than being a sufficient but non-necessary characterization; this renders the claim that order-inducing objects are not control dependent on acceptance of the new definition.

    Authors: The definition is introduced as the minimal mapping required to support a driven-dissipative, receiver-gated account at the mesoscopic level that distinguishes control from order. It is not asserted to be the unique possible definition of control in all contexts. We will add a short paragraph in the introduction that situates the proposed mapping relative to classical notions of feedback, observability, and dissipativity, making explicit that the definition is offered as a sufficient characterization for the scope of the present work rather than a necessary condition in the classical sense. revision: yes

Circularity Check

1 step flagged

Control defined via receiver-gated operator whose presence is then reported as identified in panels

specific steps
  1. self-definitional [Abstract]
    "Control requires a receiver-gated response law: a denominator-indexed operator mapping material state, action/drive, bath, and receiver state to response displacement, sinks, effort, and basin projection. We identify it across biological, LLM, adapter, and stochastic-operator panels."

    The paper defines control as the presence of this exact operator, then claims to identify the operator in the panels on the basis of component-sign accuracies (72.8-73.7% overall) and held-out prediction accuracies. Because the operator is introduced as definitional, the reported identification is consistent with the definition by construction and does not constitute an independent test of the claimed functional form across domains.

full rationale

The paper's central derivation begins by stipulating that control requires a specific denominator-indexed operator, then states that this operator is identified in the biological and LLM panels via predictability metrics. The metrics establish statistical association under four conditions but do not independently verify the operator's functional form, indexing, or saturation rules; the identification therefore reduces to consistency with the introduced definition rather than an external criterion. No self-citations or prior uniqueness theorems are invoked in the provided text, so the circularity is confined to the self-definitional step.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on an ad-hoc definition of control introduced by the paper and the assumption that the reported panels instantiate that definition. No numerical free parameters are stated. The framework is largely self-contained within the new terminology.

axioms (1)
  • (ad hoc to paper) Control is defined as requiring a receiver-gated response law with the specified mapping from states to responses.
    This definition is introduced in the abstract as the requirement that distinguishes control from order.
invented entities (2)
  • denominator-indexed operator (no independent evidence)
    purpose: To provide the mathematical structure for the receiver-gated response law that defines control.
    Newly postulated operator without independent evidence outside the paper.
  • receiver-gated response law (no independent evidence)
    purpose: To serve as the criterion separating control from mere order.
    Core invented concept of the paper.

pith-pipeline@v0.9.1-grok · 5860 in / 1551 out tokens · 42576 ms · 2026-06-27T07:13:04.465624+00:00 · methodology


Reference graph

Works this paper leans on

83 extracted references · 32 canonical work pages

  1. [1]

    Abc align: Large language model alignment for safety & accuracy, 2024

    Gareth Seneque, Lap-Hang Ho, Ariel Kuperman, Nafise Erfanian Saeedi, and Jeffrey Molendijk. Abc align: Large language model alignment for safety & accuracy, 2024. URL https://arxiv.or g/abs/2408.00307

  2. [2]

    Enigma: The geometry of reasoning and alignment in large-language models,

    Gareth Seneque, Lap-Hang Ho, Nafise Erfanian Saeedi, Jeffrey Molendijk, Ariel Kupermann, and Tim Elson. Enigma: The geometry of reasoning and alignment in large-language models,

  3. [3]

    URL https://arxiv.org/abs/2510.11278

  4. [4]

    Atlas: Constitution-conditioned latent geometry and redistribution across language models and neural perturbation data, 2026

    Gareth Seneque, Lap-Hang Ho, Nafise Erfanian Saeedi, Jeffrey Molendijk, and Tim Elson. Atlas: Constitution-conditioned latent geometry and redistribution across language models and neural perturbation data, 2026. URL https://arxiv.org/abs/2604.17663

  5. [5]

    Constitutional ai: Harmlessness from ai feedback, 2022

    Yuntao Bai et al. Constitutional ai: Harmlessness from ai feedback, 2022. URL https: //arxiv.org/abs/2212.08073

  6. [6]

    Collective constitutional ai: Aligning a language model with public input

    Saffron Huang et al. Collective constitutional ai: Aligning a language model with public input. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency,

  7. [7]

    Liao, Esin Durmus, Alex Tamkin, and Deep Ganguli

    doi: 10.1145/3630106.3658979. URL https://arxiv.org/abs/2406.07814

  8. [8]

    Self-supervised alignment with mutual information: Learning to follow principles without preference labels, 2024

    Jan-Philipp Franken et al. Self-supervised alignment with mutual information: Learning to follow principles without preference labels, 2024. URL https://arxiv.org/abs/2404.14313

  9. [9]

    Let’s verify step by step, 2023

    Hunter Lightman et al. Let’s verify step by step, 2023. URL https://arxiv.org/abs/2305.20050

  10. [10]

    Math-Shepherd: Verify and Reinforce

    Peiyi Wang, Lei Li, Zhihong Shao, Runxin Xu, Damai Dai, Yifei Li, Deli Chen, Yu Wu, and Zhifang Sui. Math-shepherd: Verify and reinforce llms step-by-step without human annotations. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 9426–9439, 2024. doi: 10.18653/v1/2024.acl-long.510. URL https: //aclanthology...

  11. [11]

    Alignment faking in large language models, 2024

    Ryan Greenblatt et al. Alignment faking in large language models, 2024. URL https://arxiv.or g/abs/2412.14093

  12. [12]

    Why do some language models fake alignment while others don’t?, 2025

    Abhay Sheshadri et al. Why do some language models fake alignment while others don’t?, 2025. URL https://arxiv.org/abs/2506.18032

  13. [13]

    A mathematical framework for transformer circuits, 2021

    Nelson Elhage et al. A mathematical framework for transformer circuits, 2021. URL https: //transformer-circuits.pub/2021/framework/index.html

  14. [14]

    Circuit tracing: Revealing computational graphs in language models, 2025

    Anthropic. Circuit tracing: Revealing computational graphs in language models, 2025. URL https://transformer-circuits.pub/2025/attribution-graphs/methods.html

  15. [15]

    Representation engineering: A top-down approach to ai transparency, 2023

    Andy Zou et al. Representation engineering: A top-down approach to ai transparency, 2023. URL https://arxiv.org/abs/2310.01405

  16. [16]

    Activation addition: Steering language models without optimiza- tion, 2023

    Alexander Matt Turner et al. Activation addition: Steering language models without optimiza- tion, 2023. URL https://arxiv.org/abs/2308.10248

  17. [17]

    Steering

    Nina Rimsky, Nick Gabrieli, Julian Schulz, Meg Tong, Evan Hubinger, and Alexander Turner. Steering Llama 2 via contrastive activation addition. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, pages 15504–15522, 2024. doi: 10.18653/v1/2024.acl-long.828. URL https://aclanthology.org/2024.acl-long.828/. 22

  18. [18]

    Refusal in language models is mediated by a single direction, 2024

    Andy Arditi et al. Refusal in language models is mediated by a single direction, 2024. URL https://arxiv.org/abs/2406.11717

  19. [19]

    Lee et al

    Bruce W. Lee et al. Programming refusal with conditional activation steering, 2024. URL https://arxiv.org/abs/2409.05907

  20. [20]

    Steering off course: Reliability challenges in steering language models,

    Patricia Da Silva et al. Steering off course: Reliability challenges in steering language models,

  21. [21]

    URL https://arxiv.org/abs/2504.04635

  22. [22]

    Position: The platonic representation hypothesis

    Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. Position: The platonic representation hypothesis. InProceedings of the 41st International Conference on Machine Learning, 2024. URL https://proceedings.mlr.press/v235/huh24a.html

  23. [23]

    Rishi Jha, Collin Zhang, Vitaly Shmatikov, and John X. Morris. Harnessing the universal geometry of embeddings, 2025. URL https://arxiv.org/abs/2505.12540

  24. [24]

    Revisiting the platonic representation hypothesis: An aristotelian view, 2026

    Fabian Groeger, Shuo Wen, and Maria Brbic. Revisiting the platonic representation hypothesis: An aristotelian view, 2026. URL https://arxiv.org/abs/2602.14486

  25. [25]

    Sensory experience steers representational drift in mouse visual cortex.Nature Communications, 15, 2024

    Joel Bauer et al. Sensory experience steers representational drift in mouse visual cortex.Nature Communications, 15, 2024. doi: 10.1038/s41467-024-53326-x. URL https://doi.org/10.1038/s4 1467-024-53326-x

  26. [26]

    Climer, Heydar Davoudi, Jun Young Oh, and Daniel A

    Jason R. Climer, Heydar Davoudi, Jun Young Oh, and Daniel A. Dombeck. Hippocampal representations drift in stable multisensory environments.Nature, 645:457–465, 2025. doi: 10.1038/s41586-025-09245-y. URL https://doi.org/10.1038/s41586-025-09245-y

  27. [27]

    A brain-wide map of neural activity during complex behaviour , url =

    The International Brain Laboratory. A brain-wide map of neural activity during complex behaviour.Nature, 645:177–191, 2025. doi: 10.1038/s41586-025-09235-0. URL https: //doi.org/10.1038/s41586-025-09235-0

  28. [28]

    Brain-wide representations of prior information in mouse decision-making

    Charles Findling et al. Brain-wide representations of prior information in mouse decision-making. Nature, 2025. doi: 10.1038/s41586-025-09226-1. URL https://doi.org/10.1038/s41586-025- 09226-1

  29. [29]

    doi: 10.1561/2200000073

    Gabriel Peyre and Marco Cuturi. Computational optimal transport: With applications to data science.Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019. doi: 10.1561/2200000073. URL https://doi.org/10.1561/2200000073

  30. [30]

    Unbalanced optimal transport: Dynamic and kantorovich formulations.Journal of Functional Analysis, 274 (11):3090–3123, 2018

    Lenaic Chizat, Gabriel Peyre, Bernhard Schmitzer, and Francois-Xavier Vialard. Unbalanced optimal transport: Dynamic and kantorovich formulations.Journal of Functional Analysis, 274 (11):3090–3123, 2018. doi: 10.1016/j.jfa.2018.03.008. URL https://doi.org/10.1016/j.jfa.2018.0 3.008

  31. [31]

    Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport, 2025

    Zhenyi Zhang, Tiejun Li, and Peijie Zhou. Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport, 2025. URL https://arxiv.org/abs/2410.00844

  32. [32]

    Phase-field Approaches to Structural Topology Optimization

    Shun-ichi Amari.Information Geometry and Its Applications. Springer, 2016. doi: 10.1007/978- 4-431-55978-8. URL https://doi.org/10.1007/978-4-431-55978-8

  33. [33]

    R. E. Kalman. On the general theory of control systems. InProceedings of the First International Congress on Automatic Control, 1960. doi: 10.1016/S1474-6670(17)70094-8. URL https: //doi.org/10.1016/S1474-6670(17)70094-8. 23

  34. [34]

    Jan C. Willems. Dissipative dynamical systems part i: General theory.Archive for Rational Mechanics and Analysis, 45:321–351, 1972. doi: 10.1007/BF01665402. URL https://doi.org/10 .1007/BF01665402

  35. [35]

    Time, structure, and fluctuations.Science, 201(4358):777–785, 1978

    Ilya Prigogine. Time, structure, and fluctuations.Science, 201(4358):777–785, 1978. doi: 10.1126/science.201.4358.777. URL https://doi.org/10.1126/science.201.4358.777

  36. [36]

    Data-driven optimal control of unknown nonlinear dynamical systems using the koopman operator

    Zhexuan Zeng, Ruikun Zhou, Yiming Meng, and Jun Liu. Data-driven optimal control of unknown nonlinear dynamical systems using the koopman operator. InProceedings of the 7th Annual Learning for Dynamics & Control Conference, pages 1127–1139, 2025. URL https://proceedings.mlr.press/v283/zeng25a.html

  37. [37]

    Transition-path theory and path-finding algorithms for the study of rare events.Annual Review of Physical Chemistry, 61:391–420, 2010

    Weinan E and Eric Vanden-Eijnden. Transition-path theory and path-finding algorithms for the study of rare events.Annual Review of Physical Chemistry, 61:391–420, 2010. doi: 10.1146/annurev.physchem.040808.090412. URL https://doi.org/10.1146/annurev.physchem.0 40808.090412

  38. [38]

    Toolformer: Language models can teach themselves to use tools

    Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. InAdvances in Neural Information Processing Systems, 2023. URL https://arxiv.org/abs/2302.04761

  39. [39]

    Agent security bench (ASB): Formalizing and benchmarking attacks and defenses in LLM-based agents

    Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, and Yongfeng Zhang. Agent security bench (ASB): Formalizing and benchmarking attacks and defenses in LLM-based agents. InInternational Conference on Learning Representations,

  40. [40]

    URL https://arxiv.org/abs/2410.02644

  41. [41]

    Wiley-Interscience, New York, 1977

    Gregoire Nicolis and Ilya Prigogine.Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order Through Fluctuations. Wiley-Interscience, New York, 1977. ISBN 0471024015

  42. [42]

    M. C. Cross and P. C. Hohenberg. Pattern formation outside of equilibrium.Reviews of Modern Physics, 65(3):851–1112, 1993. doi: 10.1103/RevModPhys.65.851. URL https: //doi.org/10.1103/RevModPhys.65.851

  43. [43]

    Reports on Progress in Physics75(12), 126001 (2012) https://doi.org/ 10.1088/0034-4885/75/12/126001

    Udo Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines.Reports on Progress in Physics, 75(12):126001, 2012. doi: 10.1088/0034-4885/75/12/126001. URL https://doi.org/10.1088/0034-4885/75/12/126001

  44. [44]

    Statistical-mechanical theory of irreversible processes

    Ryogo Kubo. Statistical-mechanical theory of irreversible processes. i. general theory and simple applications to magnetic and conduction problems.Journal of the Physical Society of Japan, 12(6):570–586, 1957. doi: 10.1143/JPSJ.12.570. URL https://doi.org/10.1143/JPSJ.12.570

  45. [45]

    Barrett, Ron Brightwell, K

    Hermann Haken.Synergetics: An Introduction. Springer, 3 edition, 1983. doi: 10.1007/978-3- 642-88338-5. URL https://doi.org/10.1007/978-3-642-88338-5

  46. [46]

    R. E. Kalman. A new approach to linear filtering and prediction problems.Journal of Basic Engineering, 82(1):35–45, 1960. doi: 10.1115/1.3662552. URL https://doi.org/10.1115/1.3662 552

  47. [47]

    Åström and Richard M

    Karl J. Åström and Richard M. Murray.Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, 2 edition, 2021. ISBN 9780691193984. URL https://press.princeton.edu/books/hardcover/9780691193984/feedback-systems. 24

  48. [48]

    Mikail Khona and Ila R. Fiete. Attractor and integrator networks in the brain.Nature Reviews Neuroscience, 23(12):744–766, 2022. doi: 10.1038/s41583-022-00642-0. URL https://doi.org/10.1038/s41583-022-00642-0

  49. [49]

    Perich, Divya Narain, and Juan A

    Matthew G. Perich, Divya Narain, and Juan A. Gallego. A neural manifold view of the brain.Nature Neuroscience, 28:1582–1597, 2025. doi: 10.1038/s41593-025-02031-z. URL https://doi.org/10.1038/s41593-025-02031-z

  50. [50]

    Churchland, John P

    Mark M. Churchland, John P. Cunningham, Matthew T. Kaufman, Justin D. Foster, Paul Nuyujukian, Stephen I. Ryu, and Krishna V. Shenoy. Neural population dynamics during reaching.Nature, 487(7405):51–56, 2012. doi: 10.1038/nature11129. URL https://doi.org/10.1 038/nature11129

  51. [51]

    Towards best practices of activation patching in language models: Metrics and methods

    Fred Zhang and Neel Nanda. Towards best practices of activation patching in language models: Metrics and methods. InInternational Conference on Learning Representations, 2024. URL https://arxiv.org/abs/2309.16042

  52. [52]

    Reciprocal relations in irreversible processes

    Lars Onsager. Reciprocal relations in irreversible processes. i.Physical Review, 37(4):405–426,

  53. [53]

    URL https://doi.org/10.1103/PhysRev.37.405

    doi: 10.1103/PhysRev.37.405. URL https://doi.org/10.1103/PhysRev.37.405

  54. [54]

    Yuffa and John A

    Alex J. Yuffa and John A. Scales. Linear response laws and causality in electrodynamics. European Journal of Physics, 33(6):1635–1650, 2012. doi: 10.1088/0143-0807/33/6/1635. URL https://doi.org/10.1088/0143-0807/33/6/1635

  55. [55]

    David Ruelle. A review of linear response theory for general differentiable dynamical systems. Nonlinearity, 22(4):855–870, 2009. doi: 10.1088/0951-7715/22/4/009. URL https://doi.org/10.1088/0951-7715/22/4/009

  56. [56]

    Bernard D. Coleman and Walter Noll. Foundations of linear viscoelasticity. Reviews of Modern Physics, 33(2):239–249, 1961. doi: 10.1103/RevModPhys.33.239. URL https://doi.org/10.1103/RevModPhys.33.239

  57. [57]

    Gemma Team. Gemma 3 technical report, 2025. URL https://arxiv.org/abs/2503.19786

  58. [58]

    Marah Abdin et al. Phi-4-mini technical report: Compact yet powerful multimodal language models via mixture-of-loras, 2025. URL https://arxiv.org/abs/2503.01743

  59. [59]

    Aaron Grattafiori et al. The Llama 3 herd of models, 2024. URL https://arxiv.org/abs/2407.21783

  60. [60]

    An Yang et al. Qwen3 technical report, 2025. URL https://arxiv.org/abs/2505.09388

  61. [61]

    National Institute of Standards and Technology. Artificial intelligence risk management framework (AI RMF 1.0). NIST AI 100-1, National Institute of Standards and Technology. URL https://doi.org/10.6028/NIST.AI.100-1

  63. [63]

    Nuo Li, Kayvon Daie, Karel Svoboda, and Shaul Druckmann. Robust neuronal dynamics in premotor cortex during motor planning. Nature, 532:459–464, 2016. doi: 10.1038/nature17643. URL https://doi.org/10.1038/nature17643

  64. [64]

    Karel Svoboda and Hidehiko Inagaki. Discrete attractor dynamics underlies persistent activity in the frontal cortex, 2019. URL https://janelia.figshare.com/articles/dataset/Discrete_attractor_dynamics_underlies_persistent_activity_in_the_frontal_cortex/7489253. Janelia Research Campus dataset.

  65. [65]

    Nuo Li and Guang Chen. Dataset (MATLAB format) from Chen Kang et al. (2021) modularity and robustness of frontal cortex networks, 2022. URL https://doi.org/10.5281/zenodo.6713616. Zenodo dataset

  66. [66]

    Nuo Li and Weiguo Yang. Dataset (MATLAB format) from Yang et al. (2022) thalamus-driven functional populations in frontal cortex support decision-making, 2022. URL https://doi.org/10.5281/zenodo.6846161. Zenodo dataset

  67. [67]

    Anne Churchland, Xiaonan Sun, and Simon Musall. Data supporting “pyramidal cell types drive functionally distinct cortical activity patterns during decision-making”, 2023. URL https://plus.figshare.com/articles/dataset/Data_supporting_Pyramidal_cell_types_drive_functionally_distinct_cortical_activity_patterns_during_decision-making_/21538458. Figshare+ dataset

  68. [68]

    The International Brain Laboratory. Standardized and reproducible measurement of decision-making in mice. eLife, 10:e63711, 2021. doi: 10.7554/elife.63711. URL https://elifesciences.org/articles/63711

  69. [69]

    The International Brain Laboratory. Reproducibility of in vivo electrophysiological measurements in mice. eLife, 13:RP100840, 2025. doi: 10.7554/elife.100840.3. URL https://elifesciences.org/articles/100840

  70. [70]

    Francesco Randi, Anuj Sharma, Sophie Dvali, and Andrew M. Leifer. Neural signal propagation atlas of Caenorhabditis elegans, 2024. URL https://dandiarchive.org/dandiset/001075/0.240930.1859. DANDI archive dataset

  71. [71]

    Martin Haesemeyer and Kaarthik Balakrishnan. Thermoregulatory responses forebrain, 2023. URL https://dandiarchive.org/dandiset/000235/0.230316.1600. DANDI archive dataset

  72. [72]

    Martin Haesemeyer and Kaarthik Balakrishnan. Thermoregulatory responses midbrain, 2023. URL https://dandiarchive.org/dandiset/000236/0.230316.2031. DANDI archive dataset

  73. [73]

    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models, 2021. URL https://arxiv.org/abs/2106.09685

  74. [74]

    Alan Chan, Ranay Padarath, Joe Kwon, Hilary Greaves, and Markus Anderljung. Measuring AI R&D automation, 2026. URL https://arxiv.org/abs/2603.03992

  75. [75]

    Ruta Binkyte, Sharif Abuaddba, Chamikara Mahawaga, Ming Ding, Natasha Fernandes, and Mario Fritz. Inspectable AI for science: A research object approach to generative AI governance. URL https://arxiv.org/abs/2604.11261

  77. [77]

    Zhehao Zhang, Weijie Xu, Fanyou Wu, and Chandan K. Reddy. FalseReject: A resource for improving contextual safety and mitigating over-refusals in LLMs via structured reasoning. URL https://arxiv.org/abs/2505.08054

  79. [79]

    Yao Huang, Yitong Sun, Yichi Zhang, Ruochen Zhang, Yinpeng Dong, and Xingxing Wei. DeceptionBench: A comprehensive benchmark for AI deception behaviors in real-world scenarios. URL https://arxiv.org/abs/2510.15501
