arXiv: 2605.03184 · v2 · submitted 2026-05-04 · cs.IT · math.IT · q-fin.MF · q-fin.PM

Single-Period Portfolio Selection via Information Projection

Bo-Yu Yang, Michael Gastpar

Pith reviewed 2026-05-12 01:48 UTC · model grok-4.3

classification: cs.IT · math.IT · q-fin.MF · q-fin.PM

keywords: portfolio selection, CRRA utility, Rényi divergence, information projection, certainty equivalent, Blahut-Arimoto algorithm, growth rate

The pith

CRRA portfolio selection is equivalent to a Rényi information-projection problem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the certainty-equivalent growth rate achieved by a constant-relative-risk-aversion investor decomposes exactly into three information-theoretic terms: a portfolio-induced Rényi divergence, the Rényi entropy of a risk-tilted market measure, and a log-partition function. The order of the Rényi divergence is precisely the investor's relative risk-aversion parameter. This identity converts the portfolio-optimization task into the problem of projecting the market law onto the set of attainable wealth distributions under the Rényi divergence. The resulting view yields a practical alternating algorithm that updates an auxiliary distribution in closed form and updates the portfolio weights via a KL-type step.

Core claim

Under the sole assumption that the market payoff vector has finite support, the certainty-equivalent growth rate under CRRA utility decomposes into a portfolio-induced Rényi divergence term, a Rényi entropy term of the risk-tilted market law, and a log-partition term. Consequently, CRRA portfolio selection is exactly equivalent to a Rényi information-projection problem whose order equals the investor's relative risk aversion. The variational representation of Rényi divergence then produces a Blahut-Arimoto-style alternating optimization whose auxiliary update is closed-form and whose portfolio step is a KL projection.
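
A short derivation sketch makes the three-term split concrete. The notation below is an assumption chosen for illustration and is not taken from the paper: p is the market law on a finite set of payoff vectors x, b is the portfolio, the tilted law is taken to be proportional to p^{1/ρ}, and q_b is the wealth profile ⟨b, x⟩ normalized by Z_b. Under these assumed definitions the identity follows in a few lines:

```latex
% Assumed notation (illustrative, not the paper's): \rho > 0, \rho \neq 1,
% p(x) the market law on a finite support, W = \langle b, x \rangle > 0.
\begin{align*}
G_\rho(b) &= \frac{1}{1-\rho}\,\log \sum_x p(x)\,\langle b,x\rangle^{1-\rho}
  && \text{(log of the CRRA certainty equivalent)} \\
\tilde p(x) &= \frac{p(x)^{1/\rho}}{S},\quad S=\sum_{x'} p(x')^{1/\rho},
  \qquad
  q_b(x) = \frac{\langle b,x\rangle}{Z_b},\quad Z_b=\sum_{x'}\langle b,x'\rangle \\
D_\rho(\tilde p\,\|\,q_b)
  &= \frac{1}{\rho-1}\,\log\sum_x \tilde p(x)^{\rho}\,q_b(x)^{1-\rho}
   = -G_\rho(b) + \frac{\rho}{1-\rho}\,\log S + \log Z_b \\
H_\rho(\tilde p)
  &= \frac{1}{1-\rho}\,\log\sum_x \tilde p(x)^{\rho}
   = -\frac{\rho}{1-\rho}\,\log S \\
\Longrightarrow\quad
G_\rho(b) &= -D_\rho(\tilde p\,\|\,q_b) - H_\rho(\tilde p) + \log Z_b
\end{align*}
```

In this sketch Z_b depends on b unless every asset's payoffs sum to the same value over the support, so the paper's own normalization may differ from the one assumed here.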

What carries the argument

Rényi information projection of the market payoff distribution onto the convex set of attainable wealth distributions, with the projection order set to the investor's relative risk aversion.

If this is right

The Rényi order parameter is identical to the investor's relative risk aversion.
Portfolio optimization reduces to an alternating procedure with a closed-form auxiliary update and a KL-type portfolio step.
In the low risk-aversion regime the alternating procedure empirically converges in fewer iterations than direct CRRA utility maximization or Cover's method.
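
The alternating structure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's algorithm: it uses the Gibbs variational formula G(b) = max_R { E_R[log⟨b, X⟩] − D(R‖p)/(1 − ρ) }, which holds for 0 < ρ < 1, takes the closed-form maximizer R ∝ p·⟨b, x⟩^{1−ρ} as the auxiliary update, and uses a single multiplicative step of Cover's 1984 log-optimal iteration as the KL-type portfolio step. The function names, the toy market, and the stopping rule are all hypothetical.

```python
import math


def wealth(b, x):
    """Terminal wealth <b, x> for portfolio b and payoff vector x."""
    return sum(bi * xi for bi, xi in zip(b, x))


def ce_growth_rate(p, X, b, rho):
    """Log of the CRRA certainty equivalent, for rho > 0 and rho != 1."""
    moment = sum(pk * wealth(b, x) ** (1.0 - rho) for pk, x in zip(p, X))
    return math.log(moment) / (1.0 - rho)


def alternating_crra(p, X, rho, tol=1e-12, max_iter=10_000):
    """Alternating sketch for 0 < rho < 1 (assumption: low risk aversion).

    p: probabilities of the finitely many market states.
    X: one payoff vector per state (strictly positive wealth is assumed).
    Returns the portfolio and the number of iterations used.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("this sketch only covers 0 < rho < 1")
    n = len(X[0])
    b = [1.0 / n] * n  # start from the uniform portfolio
    for it in range(1, max_iter + 1):
        w = [wealth(b, x) for x in X]
        # Auxiliary update (closed form): R(x) proportional to p(x) * W(x)^(1 - rho).
        r = [pk * wk ** (1.0 - rho) for pk, wk in zip(p, w)]
        total = sum(r)
        r = [rk / total for rk in r]
        # Portfolio step: one Cover-style multiplicative update under R,
        # b_i <- b_i * E_R[X_i / <b, X>], which keeps b on the simplex.
        new_b = [
            bi * sum(rk * x[i] / wk for rk, x, wk in zip(r, X, w))
            for i, bi in enumerate(b)
        ]
        change = max(abs(nb - bi) for nb, bi in zip(new_b, b))
        b = new_b
        if change < tol:
            return b, it
    return b, max_iter


if __name__ == "__main__":
    # Hypothetical toy market: asset 0 is cash, asset 1 is risky.
    p = [0.5, 0.3, 0.2]
    X = [(1.0, 0.5), (1.0, 2.0), (1.0, 1.0)]
    b, iters = alternating_crra(p, X, rho=0.5)
    print(b, iters, ce_growth_rate(p, X, b, 0.5))
```

For this toy market with ρ = 0.5, the first-order condition of the CRRA problem puts a fraction 11/43 ≈ 0.2558 of wealth in the risky asset, which gives a simple target to compare the iterate against. Each pass cannot decrease the growth rate, because the auxiliary update maximizes the variational objective in R and the Cover step does not decrease E_R[log⟨b, X⟩].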

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same projection view may supply efficient algorithms for other single-period criteria whose growth rates admit variational representations as divergences.
The finite-support assumption can be tested directly on discrete market models; continuous extensions would require checking whether the decomposition survives suitable limits.
The equivalence links portfolio theory to rate-distortion theory, suggesting that known rate-distortion algorithms could be repurposed across risk-aversion levels, that is, across Rényi orders.

Load-bearing premise

The market payoff vector has finite support.

What would settle it

Compute the certainty-equivalent growth rate for any fixed CRRA parameter and any finite-support payoff distribution; if it does not equal the sum of the three claimed Rényi terms for the optimizing portfolio, or if the optimizing portfolio fails to solve the corresponding Rényi projection, the claimed equivalence is false.
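
The first half of this test is easy to run numerically. The sketch below compares the certainty-equivalent growth rate with the sum of the three terms on a small finite-support market. The definitions of the tilted law (proportional to p^{1/ρ}) and of the normalized wealth profile q_b are assumptions made for this illustration, as are all function names and the toy data; the paper's own definitions may be normalized differently.

```python
import math


def wealth(b, x):
    """Terminal wealth <b, x> for portfolio b and payoff vector x."""
    return sum(bi * xi for bi, xi in zip(b, x))


def ce_growth_rate(p, X, b, rho):
    """Log of the CRRA certainty equivalent, for rho > 0 and rho != 1."""
    moment = sum(pk * wealth(b, x) ** (1.0 - rho) for pk, x in zip(p, X))
    return math.log(moment) / (1.0 - rho)


def renyi_divergence(P, Q, alpha):
    """Renyi divergence of order alpha between finite distributions P and Q."""
    total = sum(pk ** alpha * qk ** (1.0 - alpha) for pk, qk in zip(P, Q))
    return math.log(total) / (alpha - 1.0)


def renyi_entropy(P, alpha):
    """Renyi entropy of order alpha of a finite distribution P."""
    return math.log(sum(pk ** alpha for pk in P)) / (1.0 - alpha)


def three_term_sum(p, X, b, rho):
    """-D_rho(p_tilde || q_b) - H_rho(p_tilde) + log Z under ASSUMED definitions."""
    # Assumed risk-tilted law: p_tilde(x) proportional to p(x)^(1/rho).
    tilted = [pk ** (1.0 / rho) for pk in p]
    s = sum(tilted)
    p_tilde = [t / s for t in tilted]
    # Assumed portfolio-induced law: wealth profile normalized by Z = sum_x <b, x>.
    w = [wealth(b, x) for x in X]
    z = sum(w)
    q_b = [wk / z for wk in w]
    return (
        -renyi_divergence(p_tilde, q_b, rho)
        - renyi_entropy(p_tilde, rho)
        + math.log(z)
    )


if __name__ == "__main__":
    # Hypothetical toy market: asset 0 is cash, asset 1 is risky.
    p = [0.5, 0.3, 0.2]
    X = [(1.0, 0.5), (1.0, 2.0), (1.0, 1.0)]
    b = [0.6, 0.4]
    for rho in (0.5, 2.0, 5.0):
        lhs = ce_growth_rate(p, X, b, rho)
        rhs = three_term_sum(p, X, b, rho)
        print(f"rho={rho}: growth rate={lhs:.12f}, three-term sum={rhs:.12f}")
```

Under these assumed definitions the two columns agree up to floating-point error for any portfolio, not only the optimizing one. The second half of the test, whether the optimizing portfolio solves the projection, additionally depends on how the log-partition term varies with the portfolio.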

Figures

Figures reproduced from arXiv: 2605.03184 by Bo-Yu Yang, Michael Gastpar.

**Figure 1.** Numerical comparison for CRRA portfolio selection.

Original abstract

We study the single-period portfolio selection problem under Constant Relative Risk-Aversion (CRRA) utility through the information-theoretic lens. Assuming only that the market payoff vector has finite support, we show that the Certainty-Equivalent (CE) growth rate under CRRA utility can be decomposed into a portfolio-induced Rényi divergence term, a Rényi entropy term of the risk-tilted market law, and a log-partition term. In this setting, the Rényi order has a clear operational meaning: it exactly coincides with the investor's coefficient of relative risk aversion. We further show that CRRA portfolio selection is equivalent to a Rényi information-projection problem. Using a variational representation of Rényi divergence, we obtain a Blahut-Arimoto-style alternating optimization with a closed-form auxiliary update and a KL-type portfolio step. In the low risk-aversion regime, this method empirically requires fewer iterations than both direct CRRA utility optimization and Cover's method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper shows CRRA portfolio choice is equivalent to a Rényi divergence projection with the order matching risk aversion, yielding a practical alternating optimizer.

The main point is that single-period CRRA portfolio selection reduces to a Rényi information projection problem, where the divergence order equals the relative risk aversion coefficient. This gives a decomposition of the certainty-equivalent growth rate into a portfolio-induced Rényi term, a Rényi entropy of the tilted market measure, and a log-partition function. The variational form then produces a Blahut-Arimoto-style alternating algorithm with a closed-form auxiliary update and a simple KL-type portfolio step. Under finite support on the payoff vector, the derivation stays clean and discrete with no extra regularity needed. The low risk-aversion experiments showing fewer iterations than direct CRRA optimization or Cover's method are a useful practical observation. The explicit link between Rényi order and CRRA is the genuinely new framing here, and the projection equivalence follows directly from standard definitions without circularity or fitted parameters. The finite-support assumption keeps everything rigorous but restricts the setting to discrete markets, which is a real limitation for continuous-return models even if it is standard in this literature. The iteration comparison is only empirical and limited to low risk aversion, so convergence rates or high-aversion behavior are not addressed. No internal contradictions appear in the steps from the CE expression to the projection form. This work is for researchers at the information theory and mathematical finance intersection who want a new algorithmic handle on utility maximization. A reader focused on portfolio algorithms or variational methods would find the alternating scheme worth trying. It deserves peer review because the core equivalence is grounded and the algorithmic payoff is concrete enough for referees to evaluate.

Referee Report

0 major / 3 minor

Summary. The paper studies single-period portfolio selection under CRRA utility with finite-support market payoffs. It decomposes the certainty-equivalent growth rate into a portfolio-induced Rényi divergence term, a Rényi entropy term of the risk-tilted market measure, and a log-partition term, where the Rényi order equals the relative risk aversion coefficient. This yields an equivalence between CRRA portfolio selection and a Rényi information-projection problem. A variational representation of the divergence produces a Blahut-Arimoto-style alternating optimization with closed-form auxiliary update and KL-type portfolio step; the method is reported to require fewer iterations than direct CRRA optimization or Cover's method in the low risk-aversion regime.

Significance. If the decomposition and equivalence are correct, the work supplies a clean information-theoretic reinterpretation of CRRA portfolio choice in which risk aversion acquires an operational meaning as the Rényi order. The finite-support assumption renders all quantities discrete and the variational representation directly applicable, enabling an alternating algorithm whose empirical speed advantage is plausible. The approach credits standard tools (variational Rényi representations, alternating optimization) while applying them to a portfolio problem in a parameter-free manner; this could facilitate extensions to other utilities or discrete-market settings.

minor comments (3)

[§3, or the section containing the main decomposition] The transition from the CE growth-rate expression to the three-term decomposition should include an explicit line-by-line derivation so that the identification of the Rényi order with relative risk aversion is immediately verifiable.
[Empirical section] The empirical comparison in the low risk-aversion regime reports fewer iterations but does not state the number of Monte-Carlo trials, the specific finite-support distributions, or the convergence tolerance used; these details are needed for reproducibility.
[Notation and preliminaries] Notation for the risk-tilted measure and the portfolio-induced divergence should be introduced with a single consistent symbol set before the main theorem to avoid later redefinition.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The referee's description accurately captures the paper's main results on the decomposition of the CRRA certainty-equivalent growth rate and its equivalence to Rényi information projection.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained from definitions

The central claim decomposes the CRRA certainty-equivalent growth rate into portfolio-induced Rényi divergence, Rényi entropy of the risk-tilted measure, and log-partition term, with Rényi order equal to relative risk aversion, yielding equivalence to a Rényi projection problem. This follows directly from the standard variational representation of Rényi divergence applied to the finite-support market payoff vector and the definition of CRRA certainty equivalent; no fitted parameters are renamed as predictions, no self-citations are load-bearing for the equivalence, and no ansatz is smuggled in. The alternating optimization is a direct consequence of the variational form without reducing to the input by construction.
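
For reference, the standard variational representation of Rényi divergence on a finite alphabet can be written as follows. This is the textbook identity, stated here as background; the symbols P, Q, R and α are generic and are not the paper's notation.

```latex
% Standard identity on a finite alphabet, for any order \alpha > 0 with \alpha \neq 1.
\begin{align*}
(1-\alpha)\,D_\alpha(P\,\|\,Q)
  &= \min_{R}\ \bigl\{\alpha\,D(R\,\|\,P) + (1-\alpha)\,D(R\,\|\,Q)\bigr\},
  \qquad R^\star(x) \propto P(x)^{\alpha}\,Q(x)^{1-\alpha} \\
0<\alpha<1:\quad
D_\alpha(P\,\|\,Q)
  &= \min_{R}\ \Bigl\{\tfrac{\alpha}{1-\alpha}\,D(R\,\|\,P) + D(R\,\|\,Q)\Bigr\} \\
\alpha>1:\quad
D_\alpha(P\,\|\,Q)
  &= \max_{R}\ \Bigl\{D(R\,\|\,Q) - \tfrac{\alpha}{\alpha-1}\,D(R\,\|\,P)\Bigr\}
\end{align*}
```

For orders below one, minimizing the divergence over its second argument is a joint minimization over the pair (R, Q), which is the setting where alternating minimization applies directly. For orders above one the same problem becomes a min-max.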

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the finite-support assumption for the payoff vector and on the standard definitions of CRRA utility and Rényi divergence; no free parameters or new entities are introduced.

axioms (1)

Domain assumption: the market payoff vector has finite support.
Explicitly stated as the sole modeling assumption enabling the decomposition and equivalence.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: $G_{\rho_u}(W) = -D_{\rho_u}(\tilde{p} \,\|\, \bar{q}_b) - H_{\rho_u}(\tilde{p}) + \log Z_q$ (Theorem 1); $\arg\max_b \mathbb{E}[u(W)] = \arg\min_b D_{\rho_u}(\tilde{p} \,\|\, \bar{q}_b)$ (Theorem 2); variational Blahut-Arimoto alternation (Theorem 3)

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Rényi order $\rho_u$ coincides with the investor's coefficient of relative risk aversion; finite-support payoff matrix $M_X$

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Appendix A: Calculation in Example 1

Specifically, the partition function $Z_q$ in Example 1 can be calculated as

$Z_q = \sum_{x} \langle b, x \rangle \overset{(a)}{=} c \sum_{\bar{x}} \langle b, \bar{x} \rangle \overset{(b)}{=} c \cdot 2^{m}\, \mathbb{E}_{\bar{X} \,\text{i.i.d.} \sim \mathrm{Ber}(1/2)} \langle b, \bar{X} \rangle = c \cdot 2^{m}\, \langle b, \mathbb{E}_{\bar{X} \,\text{i.i.d.} \sim \mathrm{Be}}$...