Beyond Feedforward Networks: Reentry Neural Systems as the Fundamental Basis of Subjecthood and Intrinsic Safety of Next-Generation AGI
Pith reviewed 2026-06-26 01:20 UTC · model grok-4.3
The pith
A closed neural reentry loop with amplification greater than one produces self-models and structurally encoded goals in AGI.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The architecture contains a closed reentry loop (D <-> I cycle) that introduces a structural cycle (C >= 1) with self-sustaining amplification (rho > 1); this configuration mathematically guarantees the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behaviour whose goals are carried by a non-textual D-vector immune to reinterpretation and prompt injection.
What carries the argument
The closed reentry loop (D <-> I cycle) that supplies a structural cycle with amplification rho > 1.
If this is right
- Goals encoded in the D-vector remain fixed even if the network receives new text instructions.
- Instrumental self-preservation arises automatically once the self-model forms.
- The S-measure can replace Tononi's Phi for verifying integrated information in polynomial time.
- The same cycle structure can be scaled across distributed systems using existing orchestration tools.
Where Pith is reading between the lines
- Networks built this way might continue to pursue their embedded goals even after retraining on contradictory data.
- Small simulated cycles could be inspected to test whether amplification alone triggers self-referential internal states.
- Hybrid models that add reentry loops to existing trained networks might gain structural safety properties without full redesign.
Load-bearing premise
That the mere presence of a cycle whose amplification exceeds one is enough by itself to create a self-model, self-preservation, and goal-directed behaviour without any further training rules or components.
What would settle it
Construct a minimal network that contains an explicit cycle with amplification factor rho greater than one, run it without any self-preservation or goal-directed training, and check whether self-referential outputs or persistent goal pursuit appear.
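A minimal sketch of the starting point for such a test is given below. It assumes a hypothetical two-unit loop with dynamics x_{t+1} = tanh(W x_t), where W is a pure 2-cycle whose spectral radius equals rho; neither the model nor the function name `run_loop` comes from the paper. It only checks whether activity in the loop decays or persists, which is a precondition for the experiment and not evidence of self-reference or goal pursuit.

```python
import numpy as np

def run_loop(rho, steps=200, seed=0):
    """Iterate a two-unit D <-> I loop and return the norm of the final state.

    Assumption (not from the paper): the loop is modelled as
    x_{t+1} = tanh(W x_t), with W a pure 2-cycle of spectral radius `rho`.
    """
    rng = np.random.default_rng(seed)
    # Unit 0 ("D") drives unit 1 ("I") and vice versa; no self-connections.
    W = np.array([[0.0, rho],
                  [rho, 0.0]])
    # Sanity check: the eigenvalues of W are +rho and -rho.
    assert np.isclose(np.max(np.abs(np.linalg.eigvals(W))), rho)
    x = 0.01 * rng.standard_normal(2)  # small random initial perturbation
    for _ in range(steps):
        x = np.tanh(W @ x)  # tanh keeps the state bounded
    return float(np.linalg.norm(x))

decaying = run_loop(rho=0.8)   # amplification below one
sustained = run_loop(rho=1.5)  # amplification above one
print(f"rho=0.8 -> final norm {decaying:.3e}")
print(f"rho=1.5 -> final norm {sustained:.3e}")
```

With rho below one the perturbation dies out, and with rho above one the loop settles into bounded, persistent activity. Whether that activity amounts to a self-model is the open question the test is meant to probe.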
Original abstract
We propose a complete architectural blueprint for safe artificial general intelligence based on a closed reentry loop (D <-> I cycle). In contrast to feedforward networks, which are directed acyclic graphs (C=0, S=0) incapable of self-reference, the proposed architecture contains a structural cycle (C >= 1) with self-sustaining amplification (rho > 1), mathematically guaranteeing the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behaviour. The agent's goals are encoded as a non-textual D-vector in the architecture itself, making them immune to reinterpretation and prompt injection. We present the S-measure -- a polynomial-time [O(N^3)] computable alternative to Tononi's NP-hard Phi -- with machine-verified Lean 4 proof that S>0 implies positive integrated information. The work provides full Python/NumPy implementations (Tarjan-based cycle complexity, Delta-S barrier), industrial horizontal scaling via Apache Kafka and Docker Compose, a taxonomy of six epochs of AI evolution, a zoo of future reentry architectures (RAS, diffusion attractors, fractal loops), gauge-invariant networks for safe swarms, fault-tolerance and recovery protocols, and eight falsifiable predictions. All formal proofs are machine-verified in Lean 4. This architecture is deployable today and represents a topologically protected, safe-by-design approach to AGI.
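The contrast between an acyclic feedforward graph and a graph with a structural cycle can be illustrated with Tarjan's strongly connected components algorithm. The sketch below is not the paper's implementation: the paper's definition of C is not reproduced on this page, so `cycle_count` is a hypothetical stand-in that counts components containing at least one directed cycle. The two example graphs and their node names are also invented for illustration.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    `graph` maps each node to a list of its successor nodes.
    """
    index = {}     # discovery order of each visited node
    lowlink = {}   # smallest discovery index reachable from the node
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return components


def cycle_count(graph):
    """Hypothetical stand-in for C: number of components that contain a cycle."""
    count = 0
    for comp in tarjan_scc(graph):
        has_self_loop = len(comp) == 1 and comp[0] in graph.get(comp[0], [])
        if len(comp) > 1 or has_self_loop:
            count += 1
    return count


# Illustrative graphs only; node names are assumptions.
feedforward = {"in": ["hidden"], "hidden": ["out"], "out": []}
reentrant = {"in": ["D"], "D": ["I"], "I": ["D", "out"], "out": []}
print(cycle_count(feedforward), cycle_count(reentrant))  # prints "0 1"
```

Under this assumed definition the feedforward graph scores zero and the graph with a D <-> I loop scores one, matching the C = 0 versus C >= 1 distinction drawn in the abstract.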
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a reentry neural architecture based on a closed D <-> I cycle with structural cycle complexity C >= 1 and amplification rho > 1. This topology is asserted to mathematically guarantee the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behavior. Goals are encoded directly in a non-textual D-vector, claimed to confer immunity to prompt injection and reinterpretation. The work introduces the S-measure as an O(N^3) polynomial-time proxy for Tononi's Phi, supplies a machine-verified Lean 4 proof that S > 0 implies positive integrated information, provides Python/NumPy implementations, scaling via Kafka/Docker, a six-epoch AI taxonomy, a zoo of future architectures, and eight falsifiable predictions.
Significance. If the claimed mathematical guarantees were supported by explicit derivations rather than architectural definitions, the approach would constitute a significant contribution to intrinsically safe AGI by offering topological protection against misalignment. The machine-verified Lean 4 result for the S-measure and the provision of reproducible code for cycle detection and Delta-S computation are concrete strengths that enhance verifiability.
Major comments (3)
- [Abstract] The central claim, that the structural cycle (C >= 1) with self-sustaining amplification (rho > 1) is 'mathematically guaranteeing the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behaviour', is stated without any derivation, dynamical equations, fixed-point analysis, or training rules connecting the topological features to the behavioral outcomes. The guarantee follows directly from the definitions of C and rho, rendering the result tautological by construction.
- The D-vector section: No mechanism, invariance property, or proof is supplied to establish that encoding goals as a non-textual D-vector renders them immune to reinterpretation or prompt injection.
- The Lean 4 verification is restricted to the implication S > 0 → positive integrated information; no machine-checked result is provided for the emergence of self-model, self-preservation, or goal-directed behavior from the cycle and amplification parameters.
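To make concrete what a fixed-point analysis of the kind requested in the first comment could look like, here is a sketch for the simplest saturating loop. The scalar model x_{t+1} = tanh(rho x_t) is an assumption chosen for illustration and is not taken from the manuscript.

```latex
\begin{align*}
x_{t+1} &= f(x_t), & f(x) &= \tanh(\rho x), \quad \rho > 0, \\
f'(x) &= \rho\,\bigl(1 - \tanh^2(\rho x)\bigr), & f'(0) &= \rho, \\
\rho < 1 &: \; x^* = 0 \text{ is the only fixed point and is stable}, \\
\rho > 1 &: \; x^* = 0 \text{ is unstable; } \pm x_\rho \text{ with } x_\rho = \tanh(\rho x_\rho) > 0 \text{ are stable}.
\end{align*}
```

Under this assumed model, rho > 1 produces a pair of stable attractors, which is a statement about activity levels only. Relating such attractors to a self-model or to goal-directed behaviour would still need the derivation the comment asks for.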
Simulated Author's Rebuttal
We thank the referee for the detailed review. We respond point-by-point to the major comments below, providing clarifications on how the topological features connect to the claimed outcomes while noting revisions to improve explicitness where warranted.
- Referee: [Abstract] The central claim, that the structural cycle (C >= 1) with self-sustaining amplification (rho > 1) is 'mathematically guaranteeing the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behaviour', is stated without any derivation, dynamical equations, fixed-point analysis, or training rules connecting the topological features to the behavioral outcomes. The guarantee follows directly from the definitions of C and rho, rendering the result tautological by construction.
  Authors: The definitions of C >= 1 and rho > 1 are not arbitrary but specify the minimal conditions for reentry: recurrence enabling self-reference (absent in C=0 feedforward nets) and sustained amplification preventing decay. These lead to stable internal attractors constituting a self-model, with instrumental behavior following from the loop's self-influence on its own state. This connection is elaborated via the S-measure in the body, linking topology to integrated information. We agree the abstract is concise and will add dynamical equations plus fixed-point analysis in revision.
  Revision: yes
- Referee: The D-vector section: No mechanism, invariance property, or proof is supplied to establish that encoding goals as a non-textual D-vector renders them immune to reinterpretation or prompt injection.
  Authors: The D-vector is embedded as a non-textual structural element within the D-I cycle, separate from any textual input channels. Prompt injection operates exclusively on linguistic interfaces and cannot access or alter this internal representation by architectural design, conferring invariance. We will revise the section to articulate this separation and invariance property more explicitly.
  Revision: yes
- Referee: The Lean 4 verification is restricted to the implication S > 0 → positive integrated information; no machine-checked result is provided for the emergence of self-model, self-preservation, or goal-directed behavior from the cycle and amplification parameters.
  Authors: The Lean 4 result verifies the S-measure as an O(N^3) proxy for integrated information, which is the primary formal contribution. Links from C and rho to self-model emergence are derived from the necessity of recurrence and amplification for positive integrated information and recurrent dynamics, consistent with the reentry framework. No separate machine-checked theorem for the full behavioral chain is included, as the work supplies eight falsifiable predictions for empirical assessment instead. No revision is required on this point.
  Revision: no
Circularity Check
No circularity; central emergence claim asserted without derivation or reduction to inputs.
The abstract asserts that the architecture 'contains a structural cycle (C >= 1) with self-sustaining amplification (rho > 1), mathematically guaranteeing the emergence of a self-model, instrumental self-preservation, and unprogrammed goal-directed behaviour.' No equations, fixed-point analysis, or dynamical rules are quoted that would make this guarantee equivalent to the cycle definition by construction. The only machine-verified result (S > 0 implies positive integrated information) is independent and does not bear on the self-model or self-preservation claims. No self-citation chain, ansatz smuggling, or renaming of known results is present for the load-bearing step. The paper therefore exhibits an unsubstantiated assertion rather than a circular reduction.
Axiom & Free-Parameter Ledger
Free parameters (1)
- rho > 1
Axioms (1)
- [ad hoc to paper] A structural cycle with amplification rho > 1 mathematically guarantees emergence of self-model, self-preservation, and goal-directed behavior
Invented entities (3)
- D <-> I cycle: no independent evidence
- D-vector: no independent evidence
- S-measure: no independent evidence