Active Inference: A Method for Phenotyping Agency in AI Systems?
Pith reviewed 2026-05-08 08:01 UTC · model grok-4.3
The pith
Empowerment, measured as the channel capacity between actions and anticipated observations, distinguishes zero-, intermediate-, and high-agency AI phenotypes when the structure of their generative models is altered.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Agency in AI systems can be phenotyped by modeling them as POMDPs under active inference, where the three criteria of intentionality, rationality, and explainability are realized through beliefs, preferences, and free energy minimization, and where empowerment, as the channel capacity between actions and anticipated observations, distinguishes zero-agency, intermediate-agency, and high-agency phenotypes via structural manipulations of the generative model.
What carries the argument
Empowerment as the channel capacity between actions and anticipated observations, which functions as the operational metric that separates agency phenotypes when the generative model is structurally altered in a T-maze paradigm.
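Concretely, the metric is standard information theory: empowerment is the capacity E = max over p(a) of I(A;O) of the channel from actions to anticipated observations. A minimal sketch of how one might compute it (hypothetical code, not the paper's implementation) uses the classic Blahut-Arimoto iteration:

```python
import numpy as np

def empowerment(p_o_given_a, n_iters=500, tol=1e-10):
    """Channel capacity max_{p(a)} I(A;O) in bits, via Blahut-Arimoto.

    p_o_given_a[a, o] is the agent's predicted distribution over
    observations o given action a (rows sum to 1).
    """
    def kl_rows(P, q):
        # per-row KL( P[a,:] || q ) in bits, with 0 log 0 := 0
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log2(P / q), 0.0)
        return (P * log_ratio).sum(axis=1)

    n_a = p_o_given_a.shape[0]
    p_a = np.full(n_a, 1.0 / n_a)      # start from a uniform action prior
    for _ in range(n_iters):
        d = kl_rows(p_o_given_a, p_a @ p_o_given_a)
        new_p_a = p_a * np.exp2(d)     # Blahut-Arimoto reweighting
        new_p_a /= new_p_a.sum()
        if np.abs(new_p_a - p_a).max() < tol:
            p_a = new_p_a
            break
        p_a = new_p_a
    return float(p_a @ kl_rows(p_o_given_a, p_a @ p_o_given_a))

# A deterministic action-to-observation channel carries 1 bit; a channel
# whose output ignores the action carries none.
ident = np.eye(2)                      # each action yields a unique observation
const = np.full((2, 2), 0.5)           # actions make no observable difference
```

Under this reading, the zero-, intermediate-, and high-agency phenotypes would correspond to generative models whose action channels sit closer to `const` or to `ident`.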
If this is right
- Structural manipulations of the generative model can produce agents with zero, intermediate, or high agency as quantified by empowerment.
- As agents engage in epistemic foraging, effective governance must shift from external constraints to internal modulation of prior preferences.
- The variational active inference setup supplies a direct bridge from computational phenotyping of agency to concrete AI governance strategies.
Where Pith is reading between the lines
- The same empowerment metric could be applied to compare agency profiles across different AI architectures beyond active inference.
- This phenotyping method suggests testable experiments in more complex environments to see whether agency levels remain distinguishable.
- Regulators might use internal preference modulation as a targeted control mechanism once agents reach higher agency phenotypes.
Load-bearing premise
The three criteria of intentionality, rationality, and explainability can be fully and minimally realized as a POMDP under a variational framework without further unstated assumptions about the generative model or the definition of empowerment.
What would settle it
A replication of the T-maze experiments in which structural changes to the generative model produce no distinguishable differences in empowerment values across the three claimed agency phenotypes, or in which the modeled action chain fails to satisfy the three criteria.
read the original abstract
The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-directedness. Here, we argue for a minimal notion open to principled inspection given three criteria: intentionality as action grounded in beliefs and desires, rationality as normatively coherent action entailed by a world model, and explainability as action causally traceable to internal states; we subsequently instantiate these as a partially observable Markov decision process under a variational framework wherein posterior beliefs, prior preferences, and the minimization of expected free energy jointly constitute an agentic action chain. Using a canonical T-maze paradigm, we evidence how empowerment, formulated as the channel capacity between actions and anticipated observations, serves as an operational metric that distinguishes zero-, intermediate-, and high-agency phenotypes through structural manipulations of the generative model. We conclude by arguing that as agents engage in epistemic foraging to resolve ambiguity, the governance controls that remain effective must shift systematically from external constraints to the internal modulation of prior preferences, offering a principled, variational bridge from computational phenotyping to AI governance strategy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a minimal definition of agency in AI systems based on three criteria—intentionality (actions grounded in beliefs and desires), rationality (normatively coherent actions from a world model), and explainability (actions causally traceable to internal states). These are instantiated as a POMDP under the active inference variational framework, where posterior beliefs, prior preferences, and expected free energy minimization form an 'agentic action chain.' Using structural manipulations of the generative model in a canonical T-maze paradigm, the authors claim that empowerment—defined as the channel capacity (mutual information) between actions and anticipated observations—serves as an operational metric to distinguish zero-, intermediate-, and high-agency phenotypes. The work concludes with implications for AI governance, suggesting a shift from external constraints to internal modulation of prior preferences as agents engage in epistemic foraging.
Significance. If the central mapping holds, this provides a computationally grounded, variational approach to phenotyping agency that could bridge theoretical definitions with practical metrics and governance strategies in AI. Strengths include the use of an established framework (active inference) with potential for reproducible simulations and falsifiable distinctions in controlled environments like the T-maze; it offers a principled way to operationalize abstract criteria without relying solely on autonomy or goal-directedness.
major comments (2)
- [Abstract and §3] Abstract and §3 (T-maze paradigm): The central claim that structural manipulations of the generative model (e.g., to transition or observation matrices) instantiate the three criteria and are then distinguished by empowerment I(A;O) is asserted without explicit construction. No equations are provided showing how, for instance, changes to prior preferences directly encode intentionality or how the variational posterior ensures explainability, making it unclear whether the manipulations are derived from the criteria or chosen post-hoc to yield different channel capacities. This is load-bearing for the operational metric claim.
- [§2] §2 (agentic action chain definition): The instantiation of the three criteria as a POMDP under expected free energy minimization appears to inherit the circularity of the active inference framework itself, as posterior beliefs and prior preferences are defined in terms of the same variational quantities used to compute empowerment. Without an independent derivation or ablation showing that the channel capacity metric minimally captures intentionality/rationality/explainability (rather than being entailed by construction), the distinction between phenotypes risks being non-falsifiable.
minor comments (2)
- [Abstract] Abstract: The phrase 'we evidence how' is imprecise for a conceptual/simulation-based claim; replace with 'we demonstrate' or 'we illustrate' if no statistical tests or error bars are reported.
- [§2] Notation: Empowerment is described as 'channel capacity between actions and anticipated observations' but the precise formulation (e.g., whether it uses the expected free energy or a separate mutual information calculation) is not clarified in the provided abstract; add an explicit equation in §2 or §3.
Simulated Author's Rebuttal
We are grateful to the referee for their constructive feedback, which highlights important areas for strengthening the formal links between our theoretical criteria and the computational model. We respond to each major comment in turn, indicating the revisions we will undertake.
read point-by-point responses
- Referee: [Abstract and §3] Abstract and §3 (T-maze paradigm): The central claim that structural manipulations of the generative model (e.g., to transition or observation matrices) instantiate the three criteria and are then distinguished by empowerment I(A;O) is asserted without explicit construction. No equations are provided showing how, for instance, changes to prior preferences directly encode intentionality or how the variational posterior ensures explainability, making it unclear whether the manipulations are derived from the criteria or chosen post-hoc to yield different channel capacities. This is load-bearing for the operational metric claim.
Authors: We accept that the manuscript does not currently provide explicit equations demonstrating how the structural manipulations derive from the three criteria. This is a valid point, and we will revise §3 to include a detailed mapping. Specifically, we will add equations showing that intentionality is operationalized by setting the prior preferences (C matrix) to encode desired states based on the agent's beliefs, rationality by the use of the transition model (B matrix) in expected free energy calculation, and explainability by the traceability through the variational posterior (Q(s)) to the action selection. We will also include a table or diagram clarifying that the manipulations (e.g., altering the observation matrix A to simulate different levels of perceptual accuracy) are chosen to instantiate varying degrees of these criteria, rather than arbitrarily to produce different empowerment values. This will make the operational metric claim more transparent and falsifiable. revision: yes
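The mapping promised in this response can be made concrete in code. The following is a hypothetical one-step illustration (not the authors' implementation; the matrix names follow the standard discrete active inference convention, using the risk-plus-ambiguity decomposition of expected free energy):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def expected_free_energy(A, B, C, q_s, eps=1e-16):
    """One-step expected free energy per action, risk + ambiguity form.

    A:   (n_obs, n_states) likelihood P(o|s)              -- 'A matrix'
    B:   (n_actions, n_states, n_states) transitions P(s'|s,a)
    C:   (n_obs,) log prior preferences over observations -- 'C vector'
    q_s: (n_states,) current posterior beliefs Q(s)
    """
    G = np.zeros(B.shape[0])
    H = -np.sum(A * np.log(A + eps), axis=0)   # entropy of P(o|s) per state
    for a in range(B.shape[0]):
        q_next = B[a] @ q_s                    # predicted next state Q(s'|a)
        q_o = A @ q_next                       # predicted observation Q(o|a)
        risk = q_o @ (np.log(q_o + eps) - C)   # divergence from preferences
        G[a] = risk + H @ q_next               # plus expected ambiguity
    return G

# Toy two-state model: action 0 reliably leads to the preferred outcome.
A = np.eye(2)                                  # unambiguous observations
B = np.stack([[[1., 1.], [0., 0.]],            # action 0 -> state 0
              [[0., 0.], [1., 1.]]])           # action 1 -> state 1
C = np.log([0.9, 0.1])                         # intentionality: prefer obs 0
G = expected_free_energy(A, B, C, np.array([0.5, 0.5]))
p_action = softmax(-G)                         # rationality: softmax of -EFE
```

In this sketch, altering C modulates preferences (intentionality), B carries the world model into the EFE calculation (rationality), and every action probability is traceable through q_next and q_o to the model's internal states (explainability).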
- Referee: [§2] §2 (agentic action chain definition): The instantiation of the three criteria as a POMDP under expected free energy minimization appears to inherit the circularity of the active inference framework itself, as posterior beliefs and prior preferences are defined in terms of the same variational quantities used to compute empowerment. Without an independent derivation or ablation showing that the channel capacity metric minimally captures intentionality/rationality/explainability (rather than being entailed by construction), the distinction between phenotypes risks being non-falsifiable.
Authors: Regarding the potential circularity, we note that while the agentic action chain is defined within the active inference framework, the empowerment metric itself—I(A;O), the mutual information between actions and observations—is a general information-theoretic measure that does not depend on the variational inference procedure per se. It can be computed from the generative model and policy. To strengthen this, we will add in the revision an analysis or simulation showing the metric's behavior under different inference schemes or by comparing to a non-variational baseline. This addresses the concern about non-falsifiability by providing an independent check on whether the phenotype distinctions hold beyond the specific variational quantities. We disagree that it is necessarily circular, as the criteria are conceptually prior and the framework is used as a tool to instantiate them. revision: partial
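The rebuttal's central point, that I(A;O) is computable without the variational inference machinery, is easy to verify in a sketch (hypothetical code, not from the paper): given any fixed action distribution and the generative model's predictive channel, the mutual information follows from the standard definition alone.

```python
import numpy as np

def mutual_information(p_a, p_o_given_a):
    """I(A;O) in bits for a fixed action distribution p_a and a
    predictive channel p_o_given_a[a, o] = P(o | a)."""
    p_o = p_a @ p_o_given_a                 # marginal P(o) under the policy
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(p_o_given_a > 0,
                             np.log2(p_o_given_a / p_o), 0.0)
    return float((p_a[:, None] * p_o_given_a * log_ratio).sum())

# A deterministic action->observation channel under a biased policy:
# I(A;O) = H(A) ~ 0.469 bits, strictly below the 1-bit channel capacity.
# The value depends on the channel and the policy, not on how posterior
# beliefs were inferred, which is the independence the authors invoke.
channel = np.eye(2)
policy = np.array([0.9, 0.1])
mi = mutual_information(policy, channel)
```

A non-variational baseline, as the authors propose, would amount to plugging a differently derived policy into the same function and checking whether the phenotype ordering survives.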
Circularity Check
No significant circularity in the paper's derivation chain
full rationale
The paper proposes instantiating the three agency criteria as a POMDP under the variational framework, with the agentic action chain constituted by posterior beliefs, prior preferences, and expected free energy minimization; this is a modeling choice for phenotyping rather than a derivation reducing results to inputs by construction. Empowerment is formulated using the standard information-theoretic definition of channel capacity between actions and anticipated observations, then applied to distinguish phenotypes via explicit structural manipulations of the generative model in T-maze simulations. No load-bearing self-citations, fitted inputs renamed as predictions, or self-definitional reductions appear in the abstract or described chain; the central claim rests on the proposed mapping and simulation evidence, which remains independent of the target metric.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Agency can be minimally defined by intentionality, rationality, and explainability, instantiated in a POMDP with expected free energy minimization.
Reference graph
Works this paper leans on
- [1] Parr, T., Pezzulo, G., Friston, K.J.: Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. The MIT Press, Cambridge, Massachusetts (2022)
- [2] Montague, P.R., Dolan, R.J., Friston, K.J., Dayan, P.: Computational psychiatry. Trends in Cognitive Sciences 16(1), 72–80 (2012). https://doi.org/10.1016/j.tics.2011.11.018
- [3] Schwartenbeck, P., Friston, K.: Computational Phenotyping in Psychiatry: A Worked Example. eNeuro 3(4), ENEURO.0049-16.2016 (2016). https://doi.org/10.1523/ENEURO.0049-16.2016
- [4] Wooldridge, M., Jennings, N.R.: Intelligent Agents: Theory and Practice. The Knowledge Engineering Review 10(2), 115–152 (1995). https://doi.org/10.1017/S0269888900008122
- [5] Jennings, N.R., Sycara, K., Wooldridge, M.: A Roadmap of Agent Research and Development. Autonomous Agents and Multi-Agent Systems 1(1), 7–38 (1998). https://doi.org/10.1023/A:1010090405266
- [6] Luck, M., d'Inverno, M.: A Conceptual Framework for Agent Definition and Development. The Computer Journal 44(1), 1–20 (2001). https://doi.org/10.1093/comjnl/44.1.1
- [7] Shavit, Y., Agarwal, S., Brundage, M., et al.: Practices for Governing Agentic AI Systems. In: Proc. Res. Paper, pp. 1–25. OpenAI (2023)
- [8] Vyas, V., Xu, Z.: Key Safety Design Overview in AI-driven Autonomous Vehicles. arXiv preprint arXiv:2412.08862 (2024). https://arxiv.org/abs/2412.08862
- [9] Huang, H.-M., Messina, E., Wade, R., English, R., Novak, B., Albus, J.: Autonomy Measures for Robots. In: ASME 2004 International Mechanical Engineering Congress and Exposition, pp. 1241–1247. American Society of Mechanical Engineers Digital Collection (2008). https://doi.org/10.1115/IMECE2004-61812
- [10] Stayton, E., Stilgoe, J.: It's Time to Rethink Levels of Automation for Self-Driving Vehicles [Opinion]. IEEE Technology and Society Magazine 39(3), 13–19 (2020). https://doi.org/10.1109/MTS.2020.3012315
- [11] Huang, H.-M., Pavek, K., Novak, B., Albus, J.S., Messina, E.R.: A Framework For Autonomy Levels For Unmanned Systems (ALFUS). NIST (2005)
- [12] Oshana, M.: Relational Autonomy. In: International Encyclopedia of Ethics, pp. 1–13. John Wiley & Sons, Ltd (2020). https://doi.org/10.1002/9781444367072.wbiee921
- [13] Mackenzie, C.: Feminist Conceptions of Autonomy. In: The Routledge Companion to Feminist Philosophy. Routledge (2017)
- [14] Ding, J., Zhang, Y., Shang, Y., et al.: Understanding World or Predicting Future? A Comprehensive Survey of World Models. arXiv preprint arXiv:2411.14499 (2024). https://doi.org/10.48550/ARXIV.2411.14499
- [15] Da Costa, L., Parr, T., Sajid, N., Veselic, S., Neacsu, V., Friston, K.: Active Inference on Discrete State-Spaces: A Synthesis. Journal of Mathematical Psychology 99, 102447 (2020). https://doi.org/10.1016/j.jmp.2020.102447
- [16] Tschantz, A., Millidge, B., Seth, A.K., Buckley, C.L.: Reinforcement Learning through Active Inference. CoRR abs/2002.12636 (2020). https://arxiv.org/abs/2002.12636
- [17] Friston, K.J., Daunizeau, J., Kiebel, S.J.: Reinforcement Learning or Active Inference? PLOS ONE 4(7), e6421 (2009). https://doi.org/10.1371/journal.pone.0006421
- [18] Da Costa, L., Sajid, N., Parr, T., Friston, K., Smith, R.: Reward Maximization Through Discrete Active Inference. Neural Computation 35(5), 807–852 (2023). https://doi.org/10.1162/neco_a_01574
- [19] Du, Y., Yang, M., Dai, B., et al.: Learning Universal Policies via Text-Guided Video Generation. arXiv preprint arXiv:2302.00111 (2023). https://doi.org/10.48550/arXiv.2302.00111
- [20] Hansen, N., Su, H., Wang, X.: TD-MPC2: Scalable, Robust World Models for Continuous Control. arXiv preprint arXiv:2310.16828 (2024). https://doi.org/10.48550/arXiv.2310.16828
- [21] Junker, F.T., Bruineberg, J., Grünbaum, T.: Predictive Minds Can Be Humean Minds. The British Journal for the Philosophy of Science (2024). https://doi.org/10.1086/733413
- [22] Albarracin, M., Constant, A., Friston, K.J., Ramstead, M.J.D.: A Variational Approach to Scripts. Frontiers in Psychology 12, 585493 (2021). https://doi.org/10.3389/fpsyg.2021.585493
- [23] Klyubin, A.S., Polani, D., Nehaniv, C.L.: Empowerment: A Universal Agent-Centric Measure of Control. In: 2005 IEEE Congress on Evolutionary Computation, Edinburgh, UK, pp. 128–135, Vol. 1 (2005). https://doi.org/10.1109/CEC.2005.1554676
- [24] Schwartenbeck, P., Passecker, J., Hauser, T.U., FitzGerald, T.H.B., Kronbichler, M., Friston, K.J.: Computational Mechanisms of Curiosity and Goal-Directed Exploration. eLife 8, e41703 (2019). https://doi.org/10.7554/eLife.41703
- [25] Friston, K.J., Frith, C.D.: Active Inference, Communication and Hermeneutics. Cortex 68, 129–143 (2015). https://doi.org/10.1016/j.cortex.2015.03.025
- [26] Limanowski, J., Blankenburg, F.: Minimal Self-Models and the Free Energy Principle. Frontiers in Human Neuroscience 7, 547 (2013). https://doi.org/10.3389/fnhum.2013.00547
- [27] Vasil, J., Badcock, P.B., Constant, A., Friston, K., Ramstead, M.J.D.: A World Unto Itself: Human Communication as Active Inference. Frontiers in Psychology 11, 417 (2020). https://doi.org/10.3389/fpsyg.2020.00417
- [28] Veissière, S.P.L., Constant, A., Ramstead, M.J.D., Friston, K.J., Kirmayer, L.J.: Thinking through Other Minds: A Variational Approach to Cognition and Culture. The Behavioral and Brain Sciences 43, e90 (2019). https://doi.org/10.1017/S0140525X19001213
- [29] Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., Pezzulo, G.: Active Inference and Epistemic Value. Cognitive Neuroscience (2015)
- [30] Davidson, D.: Essays on Actions and Events. Clarendon Press, Oxford (1980)
- [31] Smith, M.: The Humean Theory of Motivation. Mind 96(381), 36–61 (1987). https://doi.org/10.1093/mind/XCVI.381.36
- [32] Williams, D.: Predictive Processing and the Representation Wars. Minds and Machines 28(1), 141–172 (2018). https://doi.org/10.1007/s11023-017-9441-6
- [33] Bricken, T., Templeton, A., Batson, J., et al.: Towards Monosemanticity: Decomposing Language Models With Dictionary Learning. Transformer Circuits Thread, Anthropic (2023)
- [34] Lipton, Z.C.: The Mythos of Model Interpretability. Queue 16(3), 31–57 (2018). https://doi.org/10.1145/3236386.3241340
- [35] LeCun, Y.: A Path Towards Autonomous Machine Intelligence. OpenReview preprint (2022). https://openreview.net/pdf?id=BZ5a1r-kVsf
- [36] Ng, A.: Agentic Design Patterns (Parts 1–5). The Batch, DeepLearning.AI (2024). https://www.deeplearning.ai/the-batch/
- [37] Simon, H.A.: Models of Man: Social and Rational. Wiley, New York (1957)
- [38] Ortega, P.A., Braun, D.A.: Thermodynamics as a Theory of Decision-Making with Information-Processing Costs. Proceedings of the Royal Society A 469(2153), 20120683 (2013). https://doi.org/10.1098/rspa.2012.0683
- [39] Mohamed, S., Rezende, D.J.: Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning. In: Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015)
discussion (0)