Decision-Making under Combinatorial Risk
Pith reviewed 2026-06-27 16:55 UTC · model grok-4.3
The pith
People facing combinatorial risk choose based on key probability features like success increments rather than computing the full induced outcome distribution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the investment-allocation task participants prefer the option delivering the larger probability increment and, when increments are equal, the option with the higher initial success probability. Symbolic regression recovers descriptive models whose dominant terms are combinatorial-risk features such as after-investment success probability; these models account for observed choices without requiring exact evaluation of the full induced distribution. When the probability mass function is revealed, behavior shifts and is well fit by augmenting the feature model with a prospect-theoretic residual.
What carries the argument
Symbolic regression applied to choice data to discover compact models whose main inputs are combinatorial-risk features such as after-investment success probability.
If this is right
- Choices are driven primarily by the size of the probability increment produced by investment.
- When probability increments are identical, the higher initial success probability is preferred.
- Revealing the full induced PMF reduces responsiveness to combinatorial features and decreases choice variance.
- Augmenting the feature model with a prospect-theoretic residual accounts for behavior once the PMF is displayed.
Where Pith is reading between the lines
- Decision-support tools could emphasize probability increments and baseline rates rather than full outcome tables when users face multiple risky components.
- The same feature-based strategy may appear in other settings such as portfolio allocation or project selection where exact convolution of risks is impractical.
- A direct test would apply the recovered models to new combinatorial structures with different numbers of components or payoff correlations.
Load-bearing premise
The symbolic regression procedure yields models that reflect participants' actual cognitive process rather than merely fitting noise or task-specific artifacts in the collected data.
What would settle it
A new experiment in which the same participants face variants requiring explicit calculation of the full induced distribution and produce choice patterns that deviate systematically from the feature-based models recovered here.
Figures
read the original abstract
Decision-making under risk is typically studied through single-shot lottery choices. Yet many real decisions involve combinatorial risk, where risk arises from multiple risky components, so the lottery over outcomes is induced rather than given outright and can be costly to evaluate exactly. We introduce an investment-allocation task to study decision under combinatorial risk, where investing in a component raises its success probability and thereby reshapes the outcome distribution. Participants favor the option with the larger probability increment, and, when increments are equal, the option with the higher initial success probability. Revealing the induced probability mass function (PMF) substantially changes behavior, making participants less responsive to combinatorial-risk features and reducing choice variance. To explain these patterns, we move beyond standard benchmarks and hand-crafted hypotheses with symbolic regression to discover compact descriptive models. The discovered models rely mainly on combinatorial-risk features, such as the after-investment success probability, rather than exact evaluation of the full induced distribution. Behavior under the displayed PMF is then well explained by augmenting this model with a prospect-theoretic residual model. The results show that people navigate combinatorial risk primarily through its core features, shifting toward lottery valuation only when the induced PMF is displayed.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces an investment-allocation task to examine decision-making under combinatorial risk, where risk is induced by multiple components rather than given as a direct lottery. Participants prefer options with larger probability increments after investment and, when increments are equal, those with higher initial success probabilities. Revealing the induced PMF reduces responsiveness to these combinatorial features and lowers choice variance. Symbolic regression is used to discover compact models that rely primarily on combinatorial-risk features (e.g., after-investment success probability) instead of exact evaluation of the full induced distribution; behavior in the PMF-display condition is then explained by augmenting the model with a prospect-theoretic residual.
Significance. If the symbolic regression yields models that are robustly validated as descriptive of cognitive processes (rather than experimental artifacts), the work would advance understanding of how people approximate complex induced risks via core features, with implications for behavioral decision theory, bounded rationality, and applications in finance or engineering where combinatorial risks are common. The contrast between feature-based and distribution-based valuation when PMF is displayed could inform hybrid models of risk processing.
major comments (2)
- [Symbolic regression and model discovery] The central claim that participants rely on combinatorial-risk features (rather than full PMF evaluation) rests on the symbolic regression procedure. The abstract and reported method provide no details on out-of-sample validation, held-out testing, regularization, or explicit comparison against null models that ignore combinatorial structure; without these, the discovered expressions may simply recover the lowest-complexity functions correlated with choices due to the task's own feature definitions rather than reflecting cognitive mechanisms.
- [Behavioral results] No information is given on sample size, statistical tests establishing the reported preferences (larger increment, higher initial probability), effect sizes for the PMF-display manipulation, or controls for multiple comparisons. These omissions make it impossible to assess the reliability of the behavioral patterns that the symbolic models are intended to explain.
minor comments (1)
- [Model augmentation] The description of how the prospect-theoretic residual is combined with the combinatorial model (additive? multiplicative?) is not specified in sufficient detail to allow replication or assessment of identifiability.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below, indicating revisions that will strengthen the manuscript's transparency on methods and results.
read point-by-point responses
-
Referee: [Symbolic regression and model discovery] The central claim that participants rely on combinatorial-risk features (rather than full PMF evaluation) rests on the symbolic regression procedure. The abstract and reported method provide no details on out-of-sample validation, held-out testing, regularization, or explicit comparison against null models that ignore combinatorial structure; without these, the discovered expressions may simply recover the lowest-complexity functions correlated with choices due to the task's own feature definitions rather than reflecting cognitive mechanisms.
Authors: We agree that additional methodological details are needed to substantiate the symbolic regression results. The current manuscript describes the use of symbolic regression for discovering compact models but does not report the validation steps in sufficient depth. We will revise the methods section to include: (i) 5-fold cross-validation with held-out testing on 20% of trials, (ii) explicit regularization via the algorithm's complexity penalty, and (iii) direct comparisons against null models (e.g., logistic regression using only non-combinatorial features and a uniform random baseline). These additions will demonstrate that the discovered expressions outperform the nulls on out-of-sample log-likelihood and AIC, supporting that they capture task-relevant cognitive features rather than artifacts of feature definitions. revision: yes
-
Referee: [Behavioral results] No information is given on sample size, statistical tests establishing the reported preferences (larger increment, higher initial probability), effect sizes for the PMF-display manipulation, or controls for multiple comparisons. These omissions make it impossible to assess the reliability of the behavioral patterns that the symbolic models are intended to explain.
Authors: We acknowledge that the main text does not explicitly report these statistical details, which are required for evaluating the behavioral findings. We will revise the methods and results sections to state the sample size, the specific tests (paired t-tests for the increment and initial-probability preferences), effect sizes (Cohen's d), and the multiple-comparison procedure (FDR correction). These elements were part of the data-analysis pipeline but will now be presented clearly in the main manuscript to allow readers to assess reliability. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper applies symbolic regression to choice data collected from a custom investment task in order to discover compact descriptive models. This is a standard data-driven fitting procedure whose output is not equivalent to its inputs by construction, nor does the provided text contain self-definitional equations, load-bearing self-citations, uniqueness theorems imported from prior author work, or any other enumerated circular pattern. The central claim that participants rely on combinatorial features is presented as an empirical finding from the regression rather than a deductive result forced by the task definition itself. The subsequent augmentation with a prospect-theoretic residual is likewise described as an explanatory addition, not a tautological renaming. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Econometrica , volume=
Prospect theory: An analysis of decision under risk , author=. Econometrica , volume=
-
[2]
Cognitive psychology , volume=
On the shape of the probability weighting function , author=. Cognitive psychology , volume=. 1999 , publisher=
1999
-
[3]
Journal of Risk and uncertainty , volume=
Advances in prospect theory: Cumulative representation of uncertainty , author=. Journal of Risk and uncertainty , volume=. 1992 , publisher=
1992
-
[4]
Science , volume=
Using large-scale experiments and machine learning to discover theories of human decision-making , author=. Science , volume=. 2021 , publisher=
2021
-
[5]
International conference on machine learning , pages=
Cognitive model priors for predicting human decisions , author=. International conference on machine learning , pages=. 2019 , organization=
2019
-
[6]
Topics in Cognitive Science , year=
Local search and the evolution of world models , author=. Topics in Cognitive Science , year=
-
[7]
2009 , publisher=
Decision theory: Principles and approaches , author=. 2009 , publisher=
2009
-
[8]
Econometrica: Journal of the Econometric Society , pages=
Risk Aversion in the Small and in the Large , author=. Econometrica: Journal of the Econometric Society , pages=. 1964 , publisher=
1964
-
[9]
Wiley encyclopedia of operations research and management science , year=
Overweighting of small probabilities , author=. Wiley encyclopedia of operations research and management science , year=
-
[10]
Petersburg Paradox , author=
A Resource-Rational, Process-Level Account of the St. Petersburg Paradox , author=. Topics in Cognitive Science , volume=. 2020 , publisher=
2020
-
[11]
Nature communications , volume=
Rationally inattentive intertemporal choice , author=. Nature communications , volume=. 2020 , publisher=
2020
-
[12]
Current Opinion in Behavioral Sciences , volume=
Resource-rational decision making , author=. Current Opinion in Behavioral Sciences , volume=. 2021 , publisher=
2021
-
[13]
, author=
Computation-limited Bayesian updating: A resource-rational analysis of approximate Bayesian inference. , author=. Psychological Review , year=
-
[14]
Proceedings of the National Academy of Sciences , volume=
Optimal utility and probability functions for agents with finite computational precision , author=. Proceedings of the National Academy of Sciences , volume=. 2021 , publisher=
2021
-
[15]
Management Science , volume=
Decisions under uncertainty as bayesian inference on choice options , author=. Management Science , volume=. 2024 , publisher=
2024
-
[16]
American Economic Review , volume=
Perceiving prospects properly , author=. American Economic Review , volume=. 2016 , publisher=
2016
-
[17]
The Quarterly Journal of Economics , volume=
Efficient coding and risky choice , author=. The Quarterly Journal of Economics , volume=. 2022 , publisher=
2022
-
[18]
, author=
From anomalies to forecasts: Toward a descriptive model of decisions under risk, under ambiguity, and from experience. , author=. Psychological review , volume=. 2017 , publisher=
2017
-
[19]
CS294A Lecture notes , volume=
Sparse autoencoder , author=. CS294A Lecture notes , volume=
-
[20]
The Thirteenth International Conference on Learning Representations , year=
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice , author=. The Thirteenth International Conference on Learning Representations , year=
-
[21]
Transactions on Machine Learning Research , year=
Symbolic Regression is NP-hard , author=. Transactions on Machine Learning Research , year=
-
[22]
International Conference on Learning Representations , year=
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients , author=. International Conference on Learning Representations , year=
-
[23]
Advances in neural information processing systems , volume=
Discovering symbolic models from deep learning with inductive biases , author=. Advances in neural information processing systems , volume=
-
[24]
Nature Computational Science , pages=
Discovering physical laws with parallel symbolic enumeration , author=. Nature Computational Science , pages=. 2025 , publisher=
2025
-
[25]
Science advances , volume=
AI Feynman: A physics-inspired method for symbolic regression , author=. Science advances , volume=. 2020 , publisher=
2020
-
[26]
Advances in Neural Information Processing Systems , volume=
AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity , author=. Advances in Neural Information Processing Systems , volume=
-
[27]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Dimension Reduction for Symbolic Regression , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=
-
[28]
Nature Machine Intelligence , pages=
A neural symbolic model for space physics , author=. Nature Machine Intelligence , pages=. 2025 , publisher=
2025
-
[29]
Hou and Max Tegmark , booktitle=
Ziming Liu and Yixuan Wang and Sachin Vaidya and Fabian Ruehle and James Halverson and Marin Soljacic and Thomas Y. Hou and Max Tegmark , booktitle=. 2025 , url=
2025
-
[30]
arXiv , langid =:2408.10205 , primaryclass =
Kan 2.0: Kolmogorov-arnold networks meet science , author=. arXiv preprint arXiv:2408.10205 , year=
-
[31]
Advances in Neural Information Processing Systems , volume=
Symbolic regression with a learned concept library , author=. Advances in Neural Information Processing Systems , volume=
-
[32]
Proceedings of the National Academy of Sciences , volume=
SR-LLM: An incremental symbolic regression framework driven by LLM-based retrieval-augmented generation , author=. Proceedings of the National Academy of Sciences , volume=. 2025 , publisher=
2025
-
[33]
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=
In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery , author=. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=
-
[34]
International Conference on Artificial Intelligence and Statistics , pages=
Shape Arithmetic Expressions: Advancing Scientific Discovery Beyond Closed-Form Equations , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2024 , organization=
2024
-
[35]
Advances in Neural Information Processing Systems , volume=
End-to-end symbolic regression with transformers , author=. Advances in Neural Information Processing Systems , volume=
-
[36]
The Eleventh International Conference on Learning Representations , year=
Deep Generative Symbolic Regression , author=. The Eleventh International Conference on Learning Representations , year=
-
[37]
INFORMS Journal on Computing , volume=
Learning symbolic expressions: Mixed-integer formulations, cuts, and heuristics , author=. INFORMS Journal on Computing , volume=. 2023 , publisher=
2023
-
[38]
IEEE transactions on evolutionary computation , volume=
A fast and elitist multiobjective genetic algorithm: NSGA-II , author=. IEEE transactions on evolutionary computation , volume=. 2002 , publisher=
2002
-
[39]
Proceedings of the 29th International Coference on International Conference on Machine Learning , pages=
Revisiting k-means: new algorithms via Bayesian nonparametrics , author=. Proceedings of the 29th International Coference on International Conference on Machine Learning , pages=
-
[40]
International Conference on Computer Aided Systems Theory , pages=
Complexity measures for multi-objective symbolic regression , author=. International Conference on Computer Aided Systems Theory , pages=. 2015 , organization=
2015
-
[41]
Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl
Interpretable machine learning for science with PySR and SymbolicRegression. jl , author=. arXiv preprint arXiv:2305.01582 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[42]
Proceedings of the 2020 genetic and evolutionary computation conference companion , pages=
Operon C++ an efficient genetic programming framework for symbolic regression , author=. Proceedings of the 2020 genetic and evolutionary computation conference companion , pages=
2020
-
[43]
1944 , publisher=
Theory of games and economic behavior , author=. 1944 , publisher=
1944
-
[44]
Econometrica , volume=
Risk aversion in the small and in the large , author=. Econometrica , volume=. 1976 , publisher=
1976
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.