Principles of frugal inference and control

Itzel Olivos-Castillo; Paul Schrater; Xaq Pitkow

arxiv: 2406.14427 · v4 · pith:ZBUM5DCZnew · submitted 2024-06-20 · 💻 cs.AI · q-bio.NC

Pith reviewed 2026-05-24 00:17 UTC · model grok-4.3

keywords: resource-constrained POMDP · frugal inference · information cost · lossy estimation · compensatory control · linear-Gaussian approximation · pole balancing · drone stabilization

The pith

When inference carries a cost, optimal control shifts to lossy estimation, multiple equivalent solution pairs, and actions that lower future representation costs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper modifies the POMDP framework so that the cost of acquiring information through inference is traded directly against task utility. Solving the resulting problem in its local linear-Gaussian form produces three explicit principles for resource-efficient behavior. First, agents deliberately leave some uncertainty unresolved rather than performing lossless Bayesian updates. Second, imperfect inference can be paired with compensatory control actions in many different ways that still reach the same performance level. Third, control itself can be used to drive the system into states where future inference is cheaper. The authors show these rules remain useful when the approximation is dropped and the same controller is applied to nonlinear tasks such as pole balancing and drone stabilization.

Core claim

Treating information as a resource that must be budgeted inside a POMDP yields three general principles. Inference moves from exact Bayesian compression to a lossy regime that strategically tolerates unresolved uncertainty. This relaxation produces a manifold of equivalent inference-control pairs that achieve identical task performance. Control actions can additionally be chosen to reduce estimation errors and steer the dynamics into regions where representation cost is lower. These principles, first derived under a local linear-Gaussian approximation, continue to produce effective controllers for nonlinear problems such as pole balancing and drone stabilization.
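
To make the first principle concrete, here is a minimal sketch of what leaving uncertainty unresolved can look like in the simplest linear-Gaussian setting. It is not the paper's solution: the scalar model `x' = a*x + w`, `y = x + v`, the parameter values, and the function names `kalman_steady_state` and `error_variance` are all assumptions chosen for illustration, and the reduced gains are picked by hand rather than derived from an information cost. The sketch compares the steady-state error variance of the Bayes-optimal (Kalman) gain with that of deliberately smaller gains.

```python
def kalman_steady_state(a, q, r, iters=10_000, tol=1e-12):
    """Iterate the scalar Riccati recursion to its fixed point.

    Assumed toy model: x' = a*x + w, y = x + v, Var(w) = q, Var(v) = r.
    Returns (steady-state Kalman gain, steady-state posterior variance).
    """
    p = q  # any positive starting variance works for this scalar recursion
    for _ in range(iters):
        m = a * a * p + q        # predicted (prior) variance
        k = m / (m + r)          # Kalman gain for this prior
        p_new = (1.0 - k) * m    # posterior variance after the update
        converged = abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    m = a * a * p + q
    return m / (m + r), p


def error_variance(a, q, r, k):
    """Steady-state error variance of a fixed-gain filter with gain k.

    The error obeys e' = (1 - k) * (a*e + w) - k*v, so its variance P solves
    P = (1 - k)^2 * (a^2 * P + q) + k^2 * r.
    """
    contraction = ((1.0 - k) * a) ** 2
    if contraction >= 1.0:
        raise ValueError("the fixed-gain filter is unstable for this gain")
    return ((1.0 - k) ** 2 * q + k * k * r) / (1.0 - contraction)


# Hypothetical parameters: a mildly unstable state with unit noise variances.
a, q, r = 1.1, 1.0, 1.0
k_opt, p_opt = kalman_steady_state(a, q, r)
for scale in (1.0, 0.75, 0.5):
    k = scale * k_opt
    print(f"gain {k:.3f} -> error variance {error_variance(a, q, r, k):.3f}")
```

In this toy setting every gain other than the Kalman gain yields a larger error variance. The paper's argument is that accepting such a loss can be rational once the information carried by the estimate is itself priced.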

What carries the argument

The resource-constrained POMDP in which inference cost is optimized jointly with expected utility, solved via its local linear-Gaussian approximation.

If this is right

Agents perform lossy rather than Bayes-optimal inference when information is costly, deliberately leaving some uncertainty unresolved.
A manifold of inference-control pairs exists that all attain the same utility, allowing additional constraints to be met without performance loss.
Control actions can be selected to reduce estimation error and move the system into lower-cost representation regimes.
The same principles produce working controllers for nonlinear tasks once the linear-Gaussian solution is transferred.
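
One way to get a feel for how inference-control pairs trade off is to score a few by hand. The sketch below is a toy under stated assumptions, not the authors' method: it takes a scalar closed loop with a fixed filter gain `k` and a fixed feedback gain `l`, computes stationary second moments by iterating the covariance recursion, and evaluates a cost of the same form as the objective quoted on this page as Eq. 4, using the standard mutual-information formula for jointly Gaussian variables. The model, the parameter values, the default weights, the single-time-step reading of the two information terms, and the names `stationary_moments` and `frugal_cost` are all hypothetical.

```python
import math


def stationary_moments(a, b, q, r, k, l, iters=100_000, tol=1e-12):
    """Stationary second moments of the state x and its estimate h.

    Assumed toy model (all scalars):
        x' = a*x + b*u + w,   u = -l*h,   Var(w) = q
        y' = x' + v,                       Var(v) = r
        h' = pred + k*(y' - pred),         pred = a*h + b*u
    Returns (Sxx, Sxh, Shh) = (E[x^2], E[x*h], E[h^2]).
    """
    if abs(a - b * l) >= 1.0 or abs((1.0 - k) * a) >= 1.0:
        raise ValueError("closed loop is unstable for these gains")
    c = (1.0 - k) * a - b * l  # coefficient of h in the update of h
    sxx = sxh = shh = 0.0
    for _ in range(iters):
        nxx = a * a * sxx - 2 * a * b * l * sxh + (b * l) ** 2 * shh + q
        nxh = k * a * a * sxx + (a * c - k * a * b * l) * sxh - b * l * c * shh + k * q
        nhh = (k * a) ** 2 * sxx + 2 * k * a * c * sxh + c * c * shh + k * k * (q + r)
        converged = max(abs(nxx - sxx), abs(nxh - sxh), abs(nhh - shh)) < tol
        sxx, sxh, shh = nxx, nxh, nhh
        if converged:
            break
    return sxx, sxh, shh


def frugal_cost(a, b, q, r, k, l, c_x=1.0, c_u=1.0, c_n=0.1):
    """Return (total cost, task cost, information in nats); requires k > 0."""
    sxx, sxh, shh = stationary_moments(a, b, q, r, k, l)
    rho2 = sxh * sxh / (sxx * shh)            # squared correlation of x and h
    info_obs = 0.5 * math.log(1.0 + sxx / r)  # I(x; y) for y = x + v
    info_est = -0.5 * math.log(1.0 - rho2)    # I(x; h) for jointly Gaussian x, h
    task = c_x * sxx + c_u * l * l * shh      # E[c_x x^2 + c_u u^2] with u = -l*h
    info = info_obs + info_est
    return task + c_n * info, task, info


# Hypothetical parameters and hand-picked (filter gain, control gain) pairs.
a, b, q, r = 1.1, 1.0, 1.0, 1.0
for k, l in [(0.64, 0.7), (0.40, 0.7), (0.40, 0.9)]:
    total, task, info = frugal_cost(a, b, q, r, k, l)
    print(f"k={k:.2f} l={l:.2f} task={task:.3f} info={info:.3f} total={total:.3f}")
```

Sweeping `k` and `l` over a grid with `frugal_cost` is a cheap way to see how the preferred pair moves as `c_n` grows, though any pattern found in this toy should not be read as a result of the paper.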

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework supplies a concrete way to trade off computation against performance in any sequential decision problem whose state must be estimated.
It suggests that observed variability in biological behavior may reflect different points on the same manifold of equivalent solutions rather than noise.
The approach could be tested by measuring whether real agents reduce control effort in regimes where sensory noise is known to be lower.

Load-bearing premise

The behavior extracted from the local linear-Gaussian case remains representative once the same controller is applied to the original nonlinear dynamics.

What would settle it

If a controller built from the three principles fails to achieve comparable task performance or higher resource efficiency than a standard POMDP controller on pole balancing or drone stabilization, the claimed generalization would be refuted.

Figures

Figures reproduced from arXiv: 2406.14427 by Itzel Olivos-Castillo, Paul Schrater, Xaq Pitkow.

**Figure 1.** Landscape of the optimization problem. Without resource constraints, there is only one strategy {Γ, Ψ} that minimizes total loss. However, when the cost of confidence matters, the optimization landscape changes significantly. For resource-constrained agents, a family of resource-saving strategies can minimize total loss. These strategies are parameterized by a free orthogonal transformation (i.e., a rotati…

**Figure 2.** Different ways to interpret uncertainty. A) In easy tasks, modeling uncertainty as oscillations in a world with deterministic dynamics minimizes inference cost. In moderately difficult tasks, the noise is modeled as stronger oscillations in a stochastic world. Highly unstable worlds are fragile to model mismatch; the range of instability that can tolerate model mismatch changes according to task demands. B…

**Figure 3.** Evidence integration. Rational agents approximate inference based on world properties and the control objective. This leads to non-monotonic changes in the attention to new evidence (A). In multidimensional contexts, rational agents allocate their resources wisely by disregarding observations from stable directions and focusing on synthesizing optimal estimates in volatile directions where making mistake…

**Figure 4.** Moving more to think less. Rational agents apply stronger control gains compared to unconstrained agents. A higher control gain can either offset the errors resulting from suboptimal inference or make optimal beliefs affordable by reducing state variance (A). The differences in the movement trajectories of naive and skeptical agents can be explained by studying the level of surprise (the variance of the di…

**Figure 5.** Spiking Neural Network. Recursive Bayesian inference is implemented neurally using a dynamic Probabilistic Population Code: linear projections of spiking activity approximate the natural parameters of the likelihood and the belief. response variability encode observations $y_t$. Next, neural activity $r^{\text{in}}$ feeds a recurrent layer whose firing activity, $r^{\text{out}} \sim \mathrm{Poi}(\nu_t)$, encodes the belief $b_t = \mathcal{N}(\hat{x}_t, \bar{\sigma}^2)$ = …

Original abstract

A central challenge for intelligent agents in an uncertain world is striking the right balance between utility maximization and resource use, not only for external movement but also for internal computation. Existing theories of control under uncertainty typically treat inference as cost-free, despite the substantial computational and energetic burden it imposes in both artificial and biological systems. To remedy this problem, we introduce a novel variant of the POMDP framework in which the information acquired through inference is treated as a resource that must be optimized alongside utility. Solving a local linear-Gaussian approximation of the resulting problem reveals three general principles of resource-efficient control. First, when information is costly, inference shifts from a Bayes-optimal (lossless) compression of the past to a lossy regime that strategically leaves some uncertainty unresolved to optimize resource use. Second, relaxing exact Bayesian inference creates a manifold of equivalent solutions, reflecting multiple ways to combine imperfect inference with compensatory control. This flexibility can be used to meet additional objectives or constraints without sacrificing performance on the original task. Third, beyond goal attainment, control can be leveraged to counteract estimation errors and steer the system into regimes where representation costs are lower. We empirically demonstrate that these principles generalize beyond the local linear-Gaussian approximation, enabling the solution of nonlinear control problems such as pole balancing and drone stabilization. Together, these results establish a framework for rational computation that extends existing approaches to information-constrained decision-making and offers normative insight into how brains and machines can achieve effective behavior under tight computational constraints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper introduces a POMDP that prices inference cost and extracts three principles from its linear-Gaussian approximation, but the evidence that those principles carry over to nonlinear tasks is not yet solid.

The paper's main move is to add an explicit cost for the information gained through inference inside a POMDP, then solve a local linear-Gaussian version to obtain three principles: inference turns lossy, multiple imperfect inference-plus-control pairs become equivalent, and control can steer the system toward lower-cost representations. These are then tested on pole balancing and drone stabilization. The formulation itself is new relative to standard POMDPs that treat inference as free, and the three principles are a direct consequence of the joint objective rather than an added assumption. That framing is useful for anyone thinking about agents with limited internal resources.

The soft spot is the generalization step. The abstract states that the principles hold beyond the approximation and enable the nonlinear examples, yet supplies no derivation of the approximation, no error bounds, and no experimental controls that isolate the three mechanisms. Without checks such as ablating lossy versus exact inference or verifying that actions are chosen specifically to reduce representation cost, it remains possible that the nonlinear tasks succeed for unrelated reasons. The stress-test concern therefore lands: the load-bearing claim rests on an assumption that is asserted but not demonstrated in the provided material.

This work is aimed at researchers in computational neuroscience and resource-bounded AI who want a normative model for trading off computation against performance. A reader already interested in bounded-rationality POMDPs would get value from the setup even if the generalization needs more work. I would send it for peer review. The idea is coherent enough to merit referee time on the math and the experiments.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces a variant of the POMDP in which inference is treated as a costly resource to be optimized jointly with utility. Solving a local linear-Gaussian approximation of this resource-constrained POMDP is claimed to yield three principles (lossy inference, manifold of equivalent solutions, and control that steers the system into low-representation-cost regimes). These principles are asserted to generalize empirically beyond the approximation, enabling solution of nonlinear tasks such as pole balancing and drone stabilization.

Significance. If the derivation is sound and the generalization holds, the framework would supply normative principles for rational computation under explicit resource constraints, extending existing information-constrained decision-making approaches with potential relevance to both artificial agents and biological systems.

major comments (2)

[Abstract] Abstract: the central claim that the local linear-Gaussian approximation yields three general principles rests on an unshown derivation; no equations, optimization steps, or error analysis for the approximation are supplied, so it is impossible to verify whether the stated principles emerge directly from the resource-constrained objective or from additional modeling choices.
[Abstract] Abstract: the assertion that the three principles generalize to nonlinear problems is load-bearing for the paper's scope, yet the empirical demonstrations (pole balancing, drone stabilization) are described only at the level of task success; no indication is given that the experiments isolate the claimed mechanisms (e.g., by comparing lossy vs. lossless inference or by verifying that actions are chosen specifically to reduce representation cost).

minor comments (1)

[Abstract] Abstract: the phrase 'manifold of equivalent solutions' is introduced without a brief indication of its dimensionality or how it is parameterized, which would help readers assess the claimed flexibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate where revisions will be made.

Referee: [Abstract] Abstract: the central claim that the local linear-Gaussian approximation yields three general principles rests on an unshown derivation; no equations, optimization steps, or error analysis for the approximation are supplied, so it is impossible to verify whether the stated principles emerge directly from the resource-constrained objective or from additional modeling choices.

Authors: The full derivation of the three principles, including the resource-constrained objective, the local linear-Gaussian approximation, the optimization steps, and resulting equations, appears in Sections 3 and 4 of the manuscript. The abstract summarizes the outcome of that derivation, which is standard practice. We will revise the abstract to explicitly reference the sections containing the derivation and will add a short paragraph on approximation error bounds in the main text.

Revision: yes

Referee: [Abstract] Abstract: the assertion that the three principles generalize to nonlinear problems is load-bearing for the paper's scope, yet the empirical demonstrations (pole balancing, drone stabilization) are described only at the level of task success; no indication is given that the experiments isolate the claimed mechanisms (e.g., by comparing lossy vs. lossless inference or by verifying that actions are chosen specifically to reduce representation cost).

Authors: The current experiments demonstrate that the principles enable successful nonlinear control. We agree that stronger isolation of mechanisms is needed and will add, in the revision, (i) direct comparisons of lossy versus lossless inference on the same tasks and (ii) quantitative analysis showing that selected actions reduce representation cost. These controls will be reported alongside the existing task-success results.

Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation from linear-Gaussian POMDP solution is independent of target claims

The paper defines a resource-constrained POMDP variant, solves its local linear-Gaussian approximation to obtain three explicit principles, and then provides separate empirical demonstrations on nonlinear tasks (pole balancing, drone stabilization) to support generalization. This chain contains no self-definitional steps, no fitted parameters renamed as predictions, and no load-bearing self-citations or imported uniqueness theorems. The central results are obtained by direct solution of the stated approximation rather than by reduction to prior inputs or ansatzes from the same authors.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard POMDP axioms plus the modeling choice that inference cost can be represented inside the same optimization as utility; no new invented entities are introduced. The linear-Gaussian approximation itself is an ad-hoc modeling assumption required to obtain closed-form principles.

axioms (2)

Domain assumption: POMDP transition and observation models are known or can be approximated locally as linear-Gaussian. Invoked when the authors restrict analysis to a local linear-Gaussian approximation to derive the three principles.
Domain assumption: Inference cost can be quantified and traded off against expected utility inside a single objective. This is the core modeling choice that defines the novel POMDP variant.

pith-pipeline@v0.9.0 · 5795 in / 1381 out tokens · 17968 ms · 2026-05-24T00:17:19.708789+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: $\min \mathbb{E}\left[ c_x x_t^2 + c_u u_t^2 + c_n \left( I(x_t; y_t) + I(x_t; \hat{x}_t) \right) \right]$ (Eq. 4); family of solutions via free orthogonal transformation on (Γ, Ψ) (Appendix B)

IndisputableMonolith/Foundation/DimensionForcing.lean · reality_from_one_distinction · unclear

Paper passage: phase transitions between optimal, fully reactive, fully predictive and custom-fit inference regimes

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Algorithms for decision making

Mykel J Kochenderfer, Tim A Wheeler, and Kyle H Wray. Algorithms for decision making. MIT press, 2022

work page 2022

[2] [2]

Bayesian reinforce- ment learning: A survey

Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar, et al. Bayesian reinforce- ment learning: A survey. Foundations and Trends® in Machine Learning, 8(5-6):359–483, 2015

work page 2015

[3] [3]

Partially observable markov decision processes and robotics

Hanna Kurniawati. Partially observable markov decision processes and robotics. Annual Review of Control, Robotics, and Autonomous Systems, 5:253–277, 2022

work page 2022

[4] [4]

Neural networks with motivation

Sergey A Shuvaev, Ngoc B Tran, Marcus Stephenson-Jones, Bo Li, and Alexei A Koulakov. Neural networks with motivation. Frontiers in Systems Neuroscience, 14:609316, 2021

work page 2021

[5] [5]

Bayesian reasoning and machine learning

David Barber. Bayesian reasoning and machine learning. Cambridge University Press, 2012. 10

work page 2012

[6] [6]

The complexity of markov decision processes

Christos H Papadimitriou and John N Tsitsiklis. The complexity of markov decision processes. Mathematics of operations research, 12(3):441–450, 1987

work page 1987

[7] [7]

The synergy between neuroscience and control theory: the nervous system as inspiration for hard control challenges

Manu S Madhav and Noah J Cowan. The synergy between neuroscience and control theory: the nervous system as inspiration for hard control challenges. Annual Review of Control, Robotics, and Autonomous Systems, 3:243–267, 2020

work page 2020

[8] [8]

Neocortex saves energy by reducing coding precision during food scarcity

Zahid Padamsey, Danai Katsanevaki, Nathalie Dupuy, and Nathalie L Rochefort. Neocortex saves energy by reducing coding precision during food scarcity. Neuron, 110(2):280–296, 2022

work page 2022

[9] [9]

Perception as Bayesian inference

David C Knill and Whitman Richards. Perception as Bayesian inference. Cambridge University Press, 1996

work page 1996

[10] [10]

The bayesian brain: the role of uncertainty in neural coding and computation

David C Knill and Alexandre Pouget. The bayesian brain: the role of uncertainty in neural coding and computation. TRENDS in Neurosciences, 27(12):712–719, 2004

work page 2004

[11] [11]

Hippocampal remapping as hidden state inference

Honi Sanders, Matthew A Wilson, and Samuel J Gershman. Hippocampal remapping as hidden state inference. Elife, 9:e51140, 2020

work page 2020

[12] [12]

An energy budget for signaling in the grey matter of the brain

David Attwell and Simon B Laughlin. An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow & Metabolism, 21(10):1133–1145, 2001

work page 2001

[13] [13]

Paying the brain’s energy bill

Zahid Padamsey and Nathalie L Rochefort. Paying the brain’s energy bill. Current Opinion in Neurobiology, 78:102668, 2023

work page 2023

[14] [14]

One and done? optimal decisions from very few samples

Edward Vul, Noah Goodman, Thomas L Griffiths, and Joshua B Tenenbaum. One and done? optimal decisions from very few samples. Cognitive science, 38(4):599–637, 2014

[15]

Computational rationality: A converging paradigm for intelligence in brains, minds, and machines

Samuel J Gershman, Eric J Horvitz, and Joshua B Tenenbaum. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines. Science, 349(6245):273–278, 2015

[16]

Planning complexity registers as a cost in metacontrol

Wouter Kool, Samuel J Gershman, and Fiery A Cushman. Planning complexity registers as a cost in metacontrol. Journal of cognitive neuroscience, 30(10):1391–1404, 2018

[17]

Some informational aspects of visual perception

Fred Attneave. Some informational aspects of visual perception. Psychological review, 61(3):183, 1954

[18]

Possible principles underlying the transformation of sensory messages

Horace B Barlow et al. Possible principles underlying the transformation of sensory messages. Sensory communication, 1(01):217–233, 1961

[19]

A simple coding procedure enhances a neuron’s information capacity

Simon Laughlin. A simple coding procedure enhances a neuron’s information capacity. Zeitschrift für Naturforschung c, 36(9-10):910–912, 1981

[20]

Spatiotemporal contrast sensitivity of early vision

J Hans Van Hateren. Spatiotemporal contrast sensitivity of early vision. Vision research, 33(2):257–267, 1993

[21]

Decorrelation and efficient coding by retinal ganglion cells

Xaq Pitkow and Markus Meister. Decorrelation and efficient coding by retinal ganglion cells. Nature neuroscience, 15(4):628–635, 2012

[22]

A Bayesian observer model constrained by efficient coding can explain 'anti-Bayesian' percepts

Xue-Xin Wei and Alan A Stocker. A Bayesian observer model constrained by efficient coding can explain 'anti-Bayesian' percepts. Nature neuroscience, 18(10):1509–1517, 2015

[23]

Bayesian efficient coding

Il Memming Park and Jonathan W Pillow. Bayesian efficient coding. bioRxiv, 178418, 2017

[24]

Rational inattention in mice

Nikola Grujic, Jeroen Brus, Denis Burdakov, and Rafael Polania. Rational inattention in mice. Science advances, 8(9):eabj8935, 2022

[25]

Optimal neural codes for control and estimation

Alex K Susemihl, Ron Meir, and Manfred Opper. Optimal neural codes for control and estimation. Advances in neural information processing systems, 27, 2014

[26]

Homo heuristicus: Why biased minds make better inferences

Gerd Gigerenzer and Henry Brighton. Homo heuristicus: Why biased minds make better inferences. Topics in cognitive science, 1(1):107–143, 2009

[27]

The complexity dividend: when sophisticated inference matters

G Tavoni, T Doi, C Pizzica, V Balasubramanian, and JI Gold. The complexity dividend: when sophisticated inference matters. bioRxiv, 563346, 2019

[28]

Heuristics from bounded meta-learned inference

Marcel Binz, Samuel J Gershman, Eric Schulz, and Dominik Endres. Heuristics from bounded meta-learned inference. Psychological review, 2022

[29]

People construct simplified mental representations to plan

Mark K Ho, David Abel, Carlos G Correa, Michael L Littman, Jonathan D Cohen, and Thomas L Griffiths. People construct simplified mental representations to plan. Nature, 606(7912):129–136, 2022

[30]

A Course in Reinforcement Learning

D. Bertsekas. A Course in Reinforcement Learning. Athena Scientific, 2023

[31]

Robust control under uncertainty via bounded rationality and differential privacy

Vincent Pacelli and Anirudha Majumdar. Robust control under uncertainty via bounded rationality and differential privacy. In 2022 International Conference on Robotics and Automation (ICRA), pages 3467–3474. IEEE, 2022

[32]

The neural costs of optimal control

Samuel Gershman and Robert Wilson. The neural costs of optimal control. Advances in neural information processing systems, 23, 2010

[33]

Bounded Planning in Passive POMDPs

Roy Fox and Naftali Tishby. Bounded planning in passive POMDPs. arXiv preprint arXiv:1206.6405, 2012

[34]

Information-Theoretic Bounded Rationality

Pedro A Ortega, Daniel A Braun, Justin Dyer, Kee-Eung Kim, and Naftali Tishby. Information-theoretic bounded rationality. arXiv preprint arXiv:1512.06789, 2015

[35]

Adaptive coding for dynamic sensory inference

Wiktor F Młynarski and Ann M Hermundstad. Adaptive coding for dynamic sensory inference. eLife, 7:e32055, 2018

[36]

Theoretical perspectives on active sensing

Scott Cheng-Hsin Yang, Daniel M Wolpert, and Máté Lengyel. Theoretical perspectives on active sensing. Current Opinion in Behavioral Sciences, 11:100–108, 2016

[37]

Display of information for time-critical decision making

Eric J. Horvitz and Matthew Barry. Display of information for time-critical decision making, 2013

[38]

Path integral control and bounded rationality

Daniel A Braun, Pedro A Ortega, Evangelos Theodorou, and Stefan Schaal. Path integral control and bounded rationality. In 2011 IEEE symposium on adaptive dynamic programming and reinforcement learning (ADPRL), pages 202–209. IEEE, 2011

[39]

Reinforcement learning and control as probabilistic inference: Tutorial and review

Sergey Levine. Reinforcement learning and control as probabilistic inference: Tutorial and review, 2018

[40]

Bayesian inference with probabilistic population codes

Wei Ji Ma, Jeffrey M Beck, Peter E Latham, and Alexandre Pouget. Bayesian inference with probabilistic population codes. Nature neuroscience, 9(11):1432–1438, 2006

[41]

Interpreting neural response variability as Monte Carlo sampling of the posterior

Patrik Hoyer and Aapo Hyvärinen. Interpreting neural response variability as Monte Carlo sampling of the posterior. Advances in neural information processing systems, 15, 2002

[42]

Statistically optimal perception and learning: from behavior to neural representations

József Fiser, Pietro Berkes, Gergő Orbán, and Máté Lengyel. Statistically optimal perception and learning: from behavior to neural representations. Trends in cognitive sciences, 14(3):119–130, 2010

[43]

Nonlinear Bayesian filtering and learning: a neuronal dynamics for perception

Anna Kutschireiter, Simone Carlo Surace, Henning Sprekeler, and Jean-Pascal Pfister. Nonlinear Bayesian filtering and learning: a neuronal dynamics for perception. Scientific reports, 7(1):8722, 2017

[44]

Dynamic programming and optimal control: Volume I, volume 4

Dimitri Bertsekas. Dynamic programming and optimal control: Volume I, volume 4. Athena scientific, 2012

[45]

Marginalization in neural circuits with divisive normalization

Jeffrey M Beck, Peter E Latham, and Alexandre Pouget. Marginalization in neural circuits with divisive normalization. Journal of Neuroscience, 31(43):15310–15319, 2011

A The neural cost of decreasing uncertainty

In this Appendix, we detail how we quantify the number of action potentials a spiking neural network uses to perform recursive Bayesian inferenc...

[46]

β tX i=0 αi yt−i − xt y0:t #2 = var(xt|y0:t) + E

is: νout t = ζ ∥ζ∥ 1 ˜r β + ξ ∥ξ∥ α (ξ · rout t−1) + ξ · rin t + c · 1 (8) Equation 8 holds for arbitrary vectors ξ and ζ as long as they are orthogonal to each other and to the vector 1. We use ξk = cos 2πk N /N and ζk = cos 4πk N /N, where N is the number of neurons in the recurrent layer. The term c · 1 in Equation 8 is an offset that ensures positive ...
