Recognition: no theorem link
A Random Rule Model
Pith reviewed 2026-05-15 16:56 UTC · model grok-4.3
The pith
Stochastic choice arises from switching among a small library of deterministic decision rules weighted by menu characteristics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We model stochastic choice as environment-dependent switching among a small library of deterministic decision rules. A Random Rule Model generates menu-level choice probabilities via named, interpretable rules weighted by observable menu characteristics. Identification has a two-step structure: within-feature decisive-side variation identifies relative rule weights; cross-feature richness identifies the gate. Applied to binary lottery choices, the estimated weights concentrate on a small subset of rules and shift systematically with complexity and dispersion asymmetry. The model closes nearly all of the prediction gap to a flexible neural-network benchmark, while remaining interpretable, restrictive under permutation diagnostics, and portable to an independent dataset.
What carries the argument
The Random Rule Model, which forms menu-level choice probabilities by weighting a fixed library of deterministic decision rules according to observable menu characteristics, using a two-step identification that separates rule weights from the switching gate.
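As a concrete illustration, here is a minimal sketch of how a rule-mixture choice probability of this kind could be computed. The three-rule library (expected value, winning probability, top payoff), the menu features, and the softmax gate are all hypothetical stand-ins, not the paper's actual specification:

```python
import numpy as np

def rule_max_ev(menu):
    # Choose the option with the higher expected value (1.0 if A, else 0.0).
    return 1.0 if menu["p_a"] * menu["x_a"] >= menu["p_b"] * menu["x_b"] else 0.0

def rule_max_prob(menu):
    # Choose the option with the higher winning probability.
    return 1.0 if menu["p_a"] >= menu["p_b"] else 0.0

def rule_max_payoff(menu):
    # Choose the option with the larger top payoff.
    return 1.0 if menu["x_a"] >= menu["x_b"] else 0.0

RULES = [rule_max_ev, rule_max_prob, rule_max_payoff]  # illustrative library

def gate_weights(features, theta):
    # Softmax gate: maps menu features to probability weights over rules.
    scores = theta @ features          # theta: (n_rules, n_features)
    scores -= scores.max()             # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def choice_prob_a(menu, features, theta):
    # Menu-level P(choose A) = sum_k w_k(features) * rule_k(menu).
    w = gate_weights(features, theta)
    return float(sum(wk * r(menu) for wk, r in zip(w, RULES)))

menu = {"p_a": 0.8, "x_a": 10.0, "p_b": 0.3, "x_b": 40.0}
features = np.array([1.0, abs(0.8 - 0.3), abs(10.0 - 40.0) / 40.0])
theta = np.zeros((len(RULES), len(features)))  # uniform gate at theta = 0
print(choice_prob_a(menu, features, theta))
```

With the gate parameters at zero, the weights are uniform and the choice probability is just the fraction of rules that pick option A; estimation would fit `theta` so the weights respond to complexity and dispersion asymmetry as the paper describes.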
If this is right
- Rule weights concentrate on a small subset of the available rules.
- Weights shift systematically with menu complexity and dispersion asymmetry.
- The model nearly closes the out-of-sample prediction gap relative to a flexible neural-network benchmark.
- The model remains restrictive under permutation diagnostics and portable to independent data.
Where Pith is reading between the lines
- The same rule-switching structure could be tested in non-lottery domains such as consumer goods or policy choices by collecting menu-feature data.
- Standard behavioral models may emerge as special cases once the rule library and gate are allowed to depend on menu observables.
- Direct experimental manipulation of complexity and asymmetry could test whether the predicted shifts in rule weights appear in new choices.
Load-bearing premise
That stochastic choice is generated by environment-dependent switching among a small library of deterministic decision rules and that the two-step identification using within-feature and cross-feature variation correctly recovers the rule weights and the gate.
What would settle it
Applying the estimated model to fresh binary lottery data and finding that the recovered rule weights do not concentrate on a small subset or that the model no longer accounts for most of the neural-network prediction gap.
read the original abstract
We model stochastic choice as environment-dependent switching among a small library of deterministic decision rules. A Random Rule Model generates menu-level choice probabilities via named, interpretable rules weighted by observable menu characteristics. Identification has a two-step structure: within-feature decisive-side variation identifies relative rule weights; cross-feature richness identifies the gate. Applied to binary lottery choices, the estimated weights concentrate on a small subset of rules and shift systematically with complexity and dispersion asymmetry. The model closes nearly all of the prediction gap to a flexible neural-network benchmark, while remaining interpretable, restrictive under permutation diagnostics, and portable to an independent dataset.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Random Rule Model in which stochastic choice arises from environment-dependent switching among a small library of deterministic decision rules. Menu-level choice probabilities are generated by weighting these rules according to observable menu characteristics. Identification proceeds in two steps: within-feature decisive-side variation identifies relative rule weights, while cross-feature richness identifies the gate function. Applied to binary lottery choices, the estimated weights concentrate on a small subset of rules and shift systematically with complexity and dispersion asymmetry. The model closes nearly all of the prediction gap to a flexible neural-network benchmark, remains interpretable, passes permutation diagnostics, and is portable to an independent dataset.
Significance. If the two-step identification is valid, the paper supplies an interpretable, restrictive alternative to black-box models that still achieves near-equivalent predictive performance. The concentration of weights on few rules, systematic shifts with menu features, and out-of-sample portability are potentially important contributions to modeling stochastic choice in discrete settings.
major comments (2)
- [Identification] Identification section: The claim that cross-feature richness separately identifies the gate function rests on the assumption of sufficient independent variation across features. In binary lottery menus, which contain only a small number of features (probabilities and outcomes for two options), this variation may be too sparse or correlated to cleanly separate the gate from the rule weights, risking contamination of the estimated weights. This is load-bearing for the central performance claims relative to the neural-network benchmark.
- [Empirical results] Empirical application: The statement that weights 'concentrate on a small subset of rules' and 'close nearly all of the prediction gap' requires explicit reporting of the exact rule library, the numerical metrics (e.g., log-likelihood or hit rates with standard errors), and the precise definition of the neural-network benchmark. Without these details, it is impossible to evaluate whether the fit reflects genuine separation or flexibility within the model class.
minor comments (1)
- [Abstract] The abstract would be clearer if it named the specific rules in the library and the source of the binary lottery data.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive comments. We address each major comment below and describe the revisions we will undertake to strengthen the identification argument and empirical transparency.
read point-by-point responses
Referee: [Identification] Identification section: The claim that cross-feature richness separately identifies the gate function rests on the assumption of sufficient independent variation across features. In binary lottery menus, which contain only a small number of features (probabilities and outcomes for two options), this variation may be too sparse or correlated to cleanly separate the gate from the rule weights, risking contamination of the estimated weights. This is load-bearing for the central performance claims relative to the neural-network benchmark.
Authors: We acknowledge the importance of verifying sufficient independent variation. Although the menus involve only probabilities and payoffs, the dataset contains substantial menu-to-menu heterogeneity in these features, which we argue supplies the required cross-feature richness. To make this explicit, the revised manuscript will include Monte Carlo experiments calibrated to the empirical distribution of menu characteristics; these simulations will show that the two-step procedure recovers the gate parameters and rule weights with negligible bias. We will also report the correlation matrix and variance inflation factors among the menu features to document that multicollinearity does not prevent separation. These additions will directly address the concern while preserving the original identification logic. revision: yes
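The multicollinearity diagnostic the authors promise can be sketched as follows. The feature matrix here is simulated for illustration (the feature names are hypothetical), and the variance inflation factors are computed from the usual regress-each-feature-on-the-others definition:

```python
import numpy as np

# Simulated menu features: a mildly correlated stand-in for the paper's data.
rng = np.random.default_rng(0)
n = 500
complexity = rng.uniform(0.0, 1.0, n)
dispersion_asym = 0.3 * complexity + rng.normal(0.0, 0.5, n)
prob_gap = rng.uniform(0.0, 1.0, n)
X = np.column_stack([complexity, dispersion_asym, prob_gap])

def vif(X):
    # VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
    # on the remaining columns (with an intercept).
    n_obs, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n_obs), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

print(np.corrcoef(X, rowvar=False).round(2))  # pairwise feature correlations
print(vif(X).round(2))                        # variance inflation factors
```

VIFs near 1 would support the authors' claim that cross-feature variation is rich enough to separate the gate from the rule weights; large VIFs would signal the sparsity the referee worries about.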
Referee: [Empirical results] Empirical application: The statement that weights 'concentrate on a small subset of rules' and 'close nearly all of the prediction gap' requires explicit reporting of the exact rule library, the numerical metrics (e.g., log-likelihood or hit rates with standard errors), and the precise definition of the neural-network benchmark. Without these details, it is impossible to evaluate whether the fit reflects genuine separation or flexibility within the model class.
Authors: We agree that greater precision is required. The revised version will (i) list every rule in the library together with its functional form, (ii) report in-sample and out-of-sample log-likelihoods and hit rates for the Random Rule Model and the neural-network benchmark, each accompanied by bootstrapped standard errors, and (iii) specify the neural-network architecture (number of hidden layers, units per layer, activation functions, regularization, and training algorithm) and the exact loss function used. These additions will allow readers to assess the magnitude of the performance gap and the degree of rule concentration directly. revision: yes
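The bootstrapped standard errors promised in (ii) can be sketched in a few lines. The per-menu hit indicators below are simulated stand-ins (the 0.78 and 0.80 hit rates are illustrative, not the paper's estimates):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
hit_rrm = rng.uniform(size=n) < 0.78  # True if the Random Rule Model predicts the choice (simulated)
hit_nn = rng.uniform(size=n) < 0.80   # True if the neural-network benchmark does (simulated)

def bootstrap_se(stat_fn, n_obs, n_boot=2000, seed=2):
    # Resample menus with replacement; the SE is the std. dev. of the
    # statistic across bootstrap draws.
    boot_rng = np.random.default_rng(seed)
    draws = [stat_fn(boot_rng.integers(0, n_obs, n_obs)) for _ in range(n_boot)]
    return float(np.std(draws, ddof=1))

gap = hit_nn.mean() - hit_rrm.mean()
se = bootstrap_se(lambda idx: hit_nn[idx].mean() - hit_rrm[idx].mean(), n)
print(f"hit-rate gap = {gap:.3f} (bootstrap SE = {se:.3f})")
```

Resampling at the menu level, as here, respects the unit of observation; a gap small relative to its bootstrap SE would substantiate the claim that the model closes nearly all of the gap to the benchmark.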
Circularity Check
No significant circularity detected in derivation chain
full rationale
The paper's identification relies on a two-step structure using distinct data variations (within-feature decisive-side for rule weights, cross-feature richness for the gate). No quoted equations or steps reduce predictions to fitted inputs by construction, nor involve self-definitional mappings, load-bearing self-citations, or smuggled ansatzes. The model is presented as fitted to data with out-of-sample portability checks, keeping the central claims independent of the inputs.
Axiom & Free-Parameter Ledger
free parameters (1)
- rule weights
axioms (1)
- domain assumption: Stochastic choice is produced by switching among a small library of deterministic decision rules whose relative weights depend on observable menu characteristics.