
arxiv: 2605.06941 · v1 · submitted 2026-05-07 · 💻 cs.LG · math.OC

Causal-Aware Foundation-Model for Bilevel Optimization in Discrete Choice Settings

Pith reviewed 2026-05-11 01:05 UTC · model grok-4.3

classification 💻 cs.LG math.OC
keywords bilevel optimization · discrete choice models · foundation models · pricing optimization · in-context learning · imitation learning · price elasticity · revenue management

The pith

A foundation model trained on simulated discrete choice data learns to set prices and assortments in new environments by retrieving elasticity priors and respecting constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a causal-aware foundation-model approach for real-time bilevel pricing decisions where a provider picks an assortment and prices while heterogeneous customers accept or reject based on their own preferences. It trains a constrained triple-head network called C3PO on simulated data generated from classical discrete choice models, combining imitation learning for prices, multi-task revenue prediction, and in-context retrieval of elasticity information from economics literature. The resulting model produces recommendations for entirely new choice settings without seeing the underlying preference structure. Gains appear in simulated tests and real deployments across healthcare, tender pricing, airline ancillaries, and other domains, and the improvements grow larger when customers are more price-sensitive. This matters for any setting that needs fast, constraint-aware pricing without full knowledge of customer utilities.

Core claim

The C3PO network solves the bilevel problem by integrating imitation learning of prices, multi-task learning of revenue responses, and in-context learning of price elasticity, all while enforcing business constraints. Trained solely on simulated customer segments and counterfactual pairs drawn from multiple classical discrete choice models, the network produces effective pricing recommendations for randomly generated choice environments that provide no access to the true preference structure. It consistently raises pricing KPIs, with larger gains as customer price sensitivity increases, and the tuned model yields substantial improvements when deployed on real-world problems in healthcare, t

What carries the argument

The constrained triple-head price optimization (C3PO) network, which performs simultaneous price imitation, revenue response prediction, and in-context elasticity prior retrieval while projecting outputs onto feasible business constraints.
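
As a rough illustration of how three heads and a feasibility step might fit together, the NumPy sketch below runs one forward pass through a shared encoder. Nothing in it is taken from the paper: the layer sizes, the names `init_params` and `forward`, the use of a plain price box `[p_min, p_max]` as the business constraint, and the choice to feed the elasticity prior as a `(low, high)` range and clamp the third head's output to that range are all assumptions made for illustration.

```python
import numpy as np

def init_params(rng, n_features, n_products, hidden=16):
    """Random weights for a shared encoder and three heads (illustrative only)."""
    scale = 0.1
    return {
        # +2 inputs: the (low, high) elasticity prior is appended to the context
        "W_enc": rng.normal(0.0, scale, (n_features + 2, hidden)),
        "W_price": rng.normal(0.0, scale, (hidden, n_products)),
        "W_rev": rng.normal(0.0, scale, (hidden + n_products, 1)),
        "W_elas": rng.normal(0.0, scale, (hidden, 1)),
    }

def forward(params, context, elasticity_prior, p_min, p_max):
    """One forward pass: returns (feasible prices, predicted revenue, elasticity)."""
    low, high = elasticity_prior
    x = np.concatenate([context, [low, high]])
    h = np.tanh(x @ params["W_enc"])                      # shared representation

    # Head 1: price proposal, then projection onto the assumed price box.
    prices = np.clip(h @ params["W_price"], p_min, p_max)

    # Head 2: revenue prediction for the proposed price vector.
    revenue = float((np.concatenate([h, prices]) @ params["W_rev"])[0])

    # Head 3: elasticity estimate, anchored to the retrieved prior range.
    elasticity = float(np.clip((h @ params["W_elas"])[0], low, high))
    return prices, revenue, elasticity
```

With untrained random weights the outputs are meaningless as recommendations; the point is only that the price vector leaves the network already inside the feasible box, whatever the weights are.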

If this is right

  • Pricing KPIs rise consistently, and the size of the rise increases with measured customer price sensitivity.
  • The same trained network produces usable recommendations for previously unseen products and choice environments.
  • Real deployments in healthcare, tender pricing, and airline ancillary services deliver measurable revenue or margin gains across products and markets.
  • Business constraints are satisfied by construction because the network projects its outputs onto the feasible set at inference time.
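
The last bullet can be made concrete with a small sketch of an inference-time projection. The page's Figure 4 refers to a price ordering constraint, so the example assumes the feasible set is "prices are nondecreasing across products and lie in a box"; that exact form, and the helper name `project_ordered_box`, are assumptions rather than the paper's specification. The ordering step uses the standard pool-adjacent-violators procedure, followed by clipping.

```python
import numpy as np

def project_ordered_box(prices, p_min, p_max):
    """Map prices to a vector with p_min <= p[0] <= ... <= p[-1] <= p_max."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in np.asarray(prices, dtype=float):
        blocks.append([value, 1])
        # Merge neighbouring blocks while their means violate the ordering.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    ordered = np.concatenate([np.full(count, total / count) for total, count in blocks])
    # Clipping a nondecreasing vector keeps it nondecreasing, so the result is feasible.
    return np.clip(ordered, p_min, p_max)

print(project_ordered_box([3.0, 1.0, 2.0], 0.0, 10.0))  # [2. 2. 2.]
```

Whatever the raw head output is, the returned vector satisfies both constraints, which is the sense in which feasibility holds "by construction". The sketch assumes a non-empty price vector.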

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same simulation-plus-in-context pattern could be tested on other bilevel problems such as dynamic inventory allocation or personalized recommendation with capacity limits.
  • Performance may degrade if real customer responses contain systematic deviations from all classical discrete choice families that were not captured in the training simulations.
  • Adding more recent or domain-specific elasticity sources beyond the static literature corpus could further widen the observed gains.

Load-bearing premise

That data simulated from classical discrete choice models plus in-context retrieval of elasticity priors from behavioral economics literature is sufficient for the network to generalize to real customer behavior in new choice environments without access to the underlying preference structure.
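
To make the premise tangible, here is a minimal sketch of what "data simulated from a classical discrete choice model" could look like, using multinomial logit as one representative family. The specification is an assumption: one shared negative price coefficient `beta`, a no-purchase option with utility 0, uniformly sampled what-if prices, and the hypothetical helper names `mnl_probs`, `expected_revenue`, and `simulate_counterfactuals`. The paper's actual generator is not described on this page.

```python
import numpy as np

def mnl_probs(prices, base_utility, beta):
    """Purchase probabilities under multinomial logit with an outside option."""
    utility = base_utility + beta * prices      # beta < 0: higher price, lower utility
    shift = max(utility.max(), 0.0)             # subtract the max for numerical stability
    weights = np.exp(utility - shift)
    return weights / (np.exp(-shift) + weights.sum())

def expected_revenue(prices, base_utility, beta):
    """Expected revenue per customer for one price vector."""
    return float(prices @ mnl_probs(prices, base_utility, beta))

def simulate_counterfactuals(rng, n_products=3, n_pairs=5):
    """Draw one hidden preference structure and several (prices, revenue) pairs."""
    base_utility = rng.normal(0.0, 1.0, n_products)   # hidden from the learner
    beta = -rng.uniform(0.5, 3.0)                      # hidden price sensitivity
    prices = rng.uniform(0.1, 2.0, (n_pairs, n_products))
    revenues = np.array([expected_revenue(p, base_utility, beta) for p in prices])
    return prices, revenues
```

A learner trained on such pairs only ever sees prices and outcomes, never `base_utility` or `beta`, which mirrors the "no access to the underlying preference structure" condition. It also shows why the premise is load-bearing: every training example obeys the logit functional form.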

What would settle it

Run the deployed model on a live market whose observed acceptance rates deviate sharply from predictions of any classical discrete choice model (for example, strong herding or reference-point effects) and check whether the KPI lift disappears or reverses.
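
One way to operationalise "disappears or reverses" is to put an uncertainty interval around the measured lift. The sketch below uses a percentile bootstrap on per-transaction KPI values from a baseline arm and a model-priced arm. The function name `kpi_lift_ci`, the two-arm design, and the choice of the mean as the KPI are assumptions for illustration, not part of the paper or of this review.

```python
import numpy as np

def kpi_lift_ci(baseline, treated, n_boot=2000, alpha=0.05, seed=0):
    """Mean KPI lift (treated minus baseline) with a percentile bootstrap interval."""
    rng = np.random.default_rng(seed)
    baseline = np.asarray(baseline, dtype=float)
    treated = np.asarray(treated, dtype=float)
    lift = treated.mean() - baseline.mean()

    boot = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each arm independently, with replacement.
        b = rng.choice(baseline, size=baseline.size, replace=True)
        t = rng.choice(treated, size=treated.size, replace=True)
        boot[i] = t.mean() - b.mean()

    low, high = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return lift, low, high
```

Under this reading, an interval that straddles zero on the non-classical market would count as the lift disappearing, and an interval entirely below zero as a reversal.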

Figures

Figures reproduced from arXiv: 2605.06941 by Jayant Kalagnanam, Markus Ettl, Shivaram Subramanian, Yingdong Lu, Zhengliang Xue.

Figure 1: High-level C3PO model architecture. Accompanying text from the paper (truncated at both ends): "… elasticity prior provides a plausible elasticity range which anchors the price predictions; and (iii) multi-task learning of the price-revenue curve: an auxiliary revenue head learns to predict revenue for any price vector. This head is frozen and used as a reward signal to direct the price head towards higher-revenue regions. The imitation learning module learns a repr…"
Figure 2: Revenue-loss term as a function of dataset count, including a log-scaled y-axis and a …
Figure 3: Training loss as a function of dataset count, including a log-scaled y-axis and a 50-point …
Figure 4: Average price ordering constraint violation as a function of dataset count, including a linear …
Original abstract

We introduce a causal aware foundation-model framework for real time optimal decision making in discrete choice environments. We propose a constrained triple-head price optimization (C3PO) network to solve a bilevel decision problem in which a service provider selects an optimal assortment while heterogeneous users make personalized acceptance or rejection choices optimizing their own personalized preferences. C3PO integrates imitation learning of prices, multi-task learning of revenue responses, and in context learning of price elasticity to generate pricing recommendations while adhering to business constraints. During inference, frontier model prompting retrieves an enhanced elasticity prior for new products from behavioral economics literature, improving pricing effectiveness. We demonstrate strong in context learning performance using simulated, synthetic, and real-world datasets. C3PO is trained on simulated data generated from multiple classical discrete choice models in economics. The model is trained on data comprising simulated customer segments and counterfactual action and outcome pairs and evaluated on randomly generated choice environments with no access to the underlying preference structure. The trained model consistently improves the pricing KPIs, with gains increasing as customer price sensitivity increases. We also deploy the tuned foundation model for optimal pricing in real-world applications such as healthcare, tender pricing, airline ancillary pricing, and other domains, achieving substantial gains across multiple products, markets, and divisions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

4 major / 1 minor

Summary. The paper introduces a causal-aware foundation-model framework called C3PO for real-time bilevel optimization in discrete choice settings. A service provider optimizes assortments and prices subject to constraints while heterogeneous users make personalized acceptance/rejection decisions; the network combines imitation learning of prices, multi-task revenue prediction, and in-context retrieval of price-elasticity priors from behavioral-economics literature via frontier-model prompting. The model is trained exclusively on counterfactual pairs simulated from classical discrete-choice models (logit, probit, etc.) and evaluated on randomly generated environments drawn from the same model families, with no access to true preference parameters. The authors claim consistent KPI improvements that increase with customer price sensitivity and report substantial gains after deploying the tuned model in healthcare, tender, and airline ancillary pricing.

Significance. If the empirical claims can be substantiated with quantitative metrics and proper validation, the approach would represent a practical bridge between simulation-based training and real-time constrained pricing, leveraging foundation-model in-context learning to incorporate external economic priors. The training on multiple classical choice models and the explicit handling of business constraints are constructive elements that could scale to other bilevel decision problems if generalization beyond the training distribution is demonstrated.

major comments (4)
  1. [Abstract] Abstract: the central empirical claims ('consistently improves the pricing KPIs, with gains increasing as customer price sensitivity increases' and 'achieving substantial gains across multiple products, markets, and divisions') are stated without any numerical results, baselines, error bars, ablation studies, or validation protocol, making it impossible to evaluate the magnitude or statistical reliability of the reported benefits.
  2. [Training and Evaluation] Training and Evaluation sections: data are generated from classical discrete-choice models and evaluation environments are 'randomly generated choice environments' drawn from the same model families; this protocol does not constitute an out-of-distribution test and therefore cannot substantiate the claim that the network generalizes to real customer behavior when the true utility parameters are inaccessible.
  3. [Deployment] Deployment claims: statements of successful real-world application in healthcare, tender pricing, and airline ancillary pricing are presented without any quantitative before/after KPIs, comparison to incumbent methods, or confirmation against observed acceptance rates, which is load-bearing for the assertion of practical utility.
  4. [Methodology] Methodology: the in-context elasticity prior retrieved via frontier-model prompting is described as improving pricing effectiveness, yet no ablation isolating the contribution of these priors, no sensitivity analysis to literature selection, and no causal identification strategy are reported, leaving the 'causal-aware' component unverified.
minor comments (1)
  1. The description of the constrained triple-head architecture (C3PO) would benefit from an explicit diagram or pseudocode showing how the three heads interact with the bilevel constraints during inference.

Simulated Author's Rebuttal

4 responses · 1 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where revisions will be made to improve clarity, substantiation, and transparency while preserving the core contributions of the work.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central empirical claims ('consistently improves the pricing KPIs, with gains increasing as customer price sensitivity increases' and 'achieving substantial gains across multiple products, markets, and divisions') are stated without any numerical results, baselines, error bars, ablation studies, or validation protocol, making it impossible to evaluate the magnitude or statistical reliability of the reported benefits.

    Authors: We agree that the abstract would be strengthened by including concrete quantitative details. In the revised version we will expand the abstract to summarize key results from the experiments, including average revenue uplifts (with ranges across sensitivity levels), comparisons against baselines such as myopic pricing and classical optimization, references to ablation studies, and a brief mention of the validation protocol and error bars from multiple simulation runs. These details are already present in Sections 4 and 5; the abstract will now foreground them. revision: yes

  2. Referee: [Training and Evaluation] Training and Evaluation sections: data are generated from classical discrete-choice models and evaluation environments are 'randomly generated choice environments' drawn from the same model families; this protocol does not constitute an out-of-distribution test and therefore cannot substantiate the claim that the network generalizes to real customer behavior when the true utility parameters are inaccessible.

    Authors: The referee correctly notes that the evaluation remains within the family of classical discrete-choice models rather than testing true out-of-distribution real-world behavior. Our protocol is designed to evaluate performance when true preference parameters are unknown, which matches the practical setting. We will revise the manuscript to explicitly state this limitation, add a dedicated discussion of potential domain shift to real customer data, and include any anonymized real-world hold-out checks from the deployment cases. We believe the current results still demonstrate the value of the approach under the stated assumptions. revision: partial

  3. Referee: [Deployment] Deployment claims: statements of successful real-world application in healthcare, tender pricing, and airline ancillary pricing are presented without any quantitative before/after KPIs, comparison to incumbent methods, or confirmation against observed acceptance rates, which is load-bearing for the assertion of practical utility.

    Authors: We acknowledge that the deployment section currently lacks the quantitative detail needed for full evaluation. Because of confidentiality agreements, we cannot release specific before/after KPIs or direct numerical comparisons. In the revision we will expand the text to describe the validation process against observed acceptance rates, the constraint-handling outcomes, and high-level (non-proprietary) performance indicators. If this remains insufficient we are prepared to move the deployment claims to a supplementary note or qualify them more cautiously. revision: partial

  4. Referee: [Methodology] Methodology: the in-context elasticity prior retrieved via frontier-model prompting is described as improving pricing effectiveness, yet no ablation isolating the contribution of these priors, no sensitivity analysis to literature selection, and no causal identification strategy are reported, leaving the 'causal-aware' component unverified.

    Authors: We agree that the contribution of the in-context priors requires explicit verification. We will add a new ablation subsection comparing model performance with and without the retrieved elasticity priors, including sensitivity tests across different literature selections and prompting variations. We will also clarify that the causal-aware framing derives from training exclusively on counterfactual pairs generated by classical causal discrete-choice models together with the bilevel optimization structure; a short discussion of identification assumptions will be included. revision: yes

standing simulated objections not resolved
  • Specific numerical before/after KPIs and direct incumbent comparisons from the confidential real-world deployments cannot be provided.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes training C3PO on simulated customer segments and counterfactual pairs generated from classical discrete choice models, then evaluating on randomly generated environments from the same model families without access to underlying preferences. This is a standard supervised setup for testing imitation and in-context learning rather than a derivation that reduces by construction to its inputs. No equations, self-citations, or uniqueness theorems are invoked in the provided text to force the central claims; reported KPI improvements and real-world deployments are presented as empirical outcomes. The derivation chain remains self-contained against external benchmarks of simulated performance.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be extracted or audited. The framework implicitly rests on the fidelity of classical discrete-choice simulations and the relevance of retrieved economics literature, but these cannot be quantified here.

pith-pipeline@v0.9.0 · 5532 in / 1349 out tokens · 63584 ms · 2026-05-11T01:05:20.935545+00:00 · methodology


