
arxiv: 2606.30395 · v1 · pith:4HJC3FOXnew · submitted 2026-06-29 · 💻 cs.CY · cs.CL · cs.SI

Uncovering Salience-Driven Dynamics in Consumer Confidence with Generative Social Simulation

Pith reviewed 2026-06-30 04:08 UTC · model grok-4.3

keywords consumer confidence · generative simulation · synthetic population · salience · behavioral modeling · economic indicators · forecasting

The pith

Generative simulation of household responses reconstructs official consumer confidence indices more accurately than baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that consumer confidence is not just a persistent macroeconomic index but the result of households interpreting signals through their own constraints and beliefs. ConsumerSim generates these responses in a synthetic population calibrated to microdata, incorporating time-stamped economic signals, survey-style answering, post-stratification, and inertia. It reconstructs the U.S., EU27, and Japanese CCI series better than persistence, time-series, regression, and information-augmented baselines, especially around high-salience shocks. The reconstructed signal also improves short-horizon forecasts of real activity, particularly housing. Ablation tests indicate that aggregation, signals, heterogeneity, and inertia are all required for the performance.

Core claim

ConsumerSim reconstructs CCI dynamics from a microdata-calibrated synthetic population using time-stamped macroeconomic, financial, policy, and news signals, survey-like response generation, post-stratified belief expansion, and behavioral inertia alignment. It ranks first on reconstruction metrics across U.S., EU27, and Japanese series with gains around high-salience shocks and improves prediction of real activity, most consistently for housing. Mechanism analyses indicate CCI movements concentrate around salient events, subgroups align in direction but differ in magnitude, and sensitivity varies by group.

What carries the argument

ConsumerSim, a generative Human-Environment response framework that simulates individual responses to signals in a heterogeneous synthetic population and aggregates them with post-stratification and inertia.
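The aggregation step described above can be illustrated with a minimal sketch. The abstract does not give the paper's weighting or inertia formulas, so everything here is an assumption: `poststratified_mean` is textbook post-stratification (stratum means weighted by known population shares), and `apply_inertia` stands in for "behavioral inertia alignment" with plain exponential smoothing controlled by a hypothetical parameter `rho`. The function names, strata, and scores are illustrative only.

```python
from collections import defaultdict


def poststratified_mean(responses, population_shares):
    """Weight per-stratum mean scores by known population shares.

    responses: list of (stratum, score) pairs from simulated respondents.
    population_shares: dict mapping stratum -> share of the real population.
    Strata with no responses are dropped and the remaining shares renormalized.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, score in responses:
        sums[stratum] += score
        counts[stratum] += 1

    covered = [s for s in population_shares if counts[s] > 0]
    total_share = sum(population_shares[s] for s in covered)
    if total_share == 0:
        raise ValueError("no responses fall in any known stratum")

    weighted = sum(population_shares[s] * sums[s] / counts[s] for s in covered)
    return weighted / total_share


def apply_inertia(raw_series, rho):
    """Assumed inertia rule: index_t = rho * index_{t-1} + (1 - rho) * raw_t."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    smoothed = []
    previous = None
    for raw in raw_series:
        previous = raw if previous is None else rho * previous + (1 - rho) * raw
        smoothed.append(previous)
    return smoothed


# Hypothetical example: owners are over-sampled relative to a 60/40 population.
responses = [("owner", 1.0), ("owner", 0.0), ("renter", -1.0)]
shares = {"owner": 0.6, "renter": 0.4}
monthly_raw = [poststratified_mean(responses, shares), 0.4, 0.4]
monthly_index = apply_inertia(monthly_raw, rho=0.5)
```

With `rho = 0` the index follows the raw aggregate exactly; larger values make it respond more slowly, which is the kind of persistence the ablation on inertia would remove.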

If this is right

  • CCI movements concentrate around salient events.
  • Subgroup trajectories often align in direction while differing in magnitude across income, homeownership, education, and political-alignment groups.
  • Signal sensitivity varies across these groups.
  • The reconstructed signal improves short-horizon prediction of real activity, most consistently for housing outcomes.
  • Population-expansion and ablation results show representative aggregation, situational signals, persona heterogeneity, and inertia are necessary for accuracy and diagnosis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If valid, the framework could support counterfactual analysis of how altering the salience of policy news would affect confidence levels.
  • Extending it to real-time data streams might enable nowcasting of confidence shifts before official releases.
  • Similar generative approaches could be applied to model other economic sentiment measures like business confidence.
  • The emphasis on salience suggests that media coverage intensity, not just content, plays a key role in driving aggregate confidence.

Load-bearing premise

The microdata-calibrated synthetic population and the rules for generating and expanding responses produce aggregate CCI that matches real movements for reasons other than parameter tuning to the target series itself.

What would settle it

Applying the model to a hold-out period after calibration and finding that reconstruction metrics fall below the best information-augmented baseline would falsify the performance claim.
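The hold-out test above can be sketched as a simple comparison. This is an assumption-laden illustration, not the paper's evaluation protocol: RMSE is used as the metric because the abstract names none, the persistence forecast is one possible baseline, and `holdout_verdict`, `split`, and the toy numbers are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def persistence_forecast(actual):
    """Predict each period with the previous observed value (first period: itself)."""
    return [actual[0]] + list(actual[:-1])


def holdout_verdict(actual, model_pred, baseline_preds, split):
    """Score the model and every baseline only on periods at index >= split."""
    actual_holdout = actual[split:]
    model_err = rmse(model_pred[split:], actual_holdout)
    baseline_errs = {
        name: rmse(pred[split:], actual_holdout)
        for name, pred in baseline_preds.items()
    }
    best_name = min(baseline_errs, key=baseline_errs.get)
    return {
        "model_rmse": model_err,
        "best_baseline": best_name,
        "best_baseline_rmse": baseline_errs[best_name],
        "model_wins": model_err < baseline_errs[best_name],
    }


# Toy numbers for illustration only; they are not results from the paper.
actual = [100.0, 101.0, 99.0, 98.0, 102.0]
model = [100.0, 101.0, 99.0, 98.5, 101.5]
baselines = {"persistence": persistence_forecast(actual)}
verdict = holdout_verdict(actual, model, baselines, split=3)
```

The check is only meaningful if every calibration choice was frozen before index `split`; if `model_wins` came back false against the best information-augmented baseline on such a period, the performance claim would be in trouble.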

Original abstract

Consumer confidence is typically modeled as a persistent macroeconomic index, yet its movements arise from households that interpret economic information through heterogeneous constraints, exposures, prior beliefs, and attention. We introduce ConsumerSim, a generative Human-Environment response framework that reconstructs Consumer Confidence Index (CCI) dynamics from a microdata-calibrated synthetic population, time-stamped macroeconomic, financial, policy, and news signals, survey-like response generation, post-stratified belief expansion, and behavioral inertia alignment. Across U.S., EU27, and Japanese official CCI target series, ConsumerSim ranks first among persistence, time-series, regression, and information-augmented baselines on the reported reconstruction metrics, with clear gains around high-salience shocks. Its reconstructed signal also improves short-horizon prediction of real activity, most consistently for housing outcomes. Mechanism analyses show that CCI movements concentrate around salient events; subgroup trajectories often align in direction while differing in magnitude; and signal sensitivity varies across income, homeownership, education, and political-alignment groups. Population-expansion and ablation results indicate that representative aggregation, situational signals, persona heterogeneity, and inertia are necessary for both accuracy and diagnosis. The findings support a behavioral view of consumer confidence as an interpretable Human-Environment response process rather than a purely aggregate time series.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces ConsumerSim, a generative Human-Environment response framework that reconstructs Consumer Confidence Index (CCI) dynamics from a microdata-calibrated synthetic population, time-stamped macroeconomic, financial, policy, and news signals, survey-like response generation, post-stratified belief expansion, and behavioral inertia alignment. Across U.S., EU27, and Japanese official CCI target series, it claims ConsumerSim ranks first among persistence, time-series, regression, and information-augmented baselines on reconstruction metrics, with gains around high-salience shocks. The reconstructed signal also improves short-horizon prediction of real activity (most consistently for housing), and mechanism analyses show CCI movements concentrate around salient events, with subgroup trajectories aligning in direction but differing in magnitude, and varying sensitivity across income, homeownership, education, and political-alignment groups. Ablation results indicate representative aggregation, situational signals, persona heterogeneity, and inertia are necessary for accuracy.

Significance. If the reconstruction is shown to arise from independent micro mechanisms rather than calibration to the target series, the work would advance the field by supplying an interpretable, micro-founded alternative to aggregate time-series models of consumer confidence. The multi-country application, ablation studies demonstrating component necessity, and downstream links to real-activity prediction would be notable strengths. The emphasis on salience-driven dynamics and demographic heterogeneity offers diagnostic value beyond black-box forecasting.

major comments (2)
  1. Abstract: The abstract asserts first-place ranking and predictive gains but supplies no numerical metrics, error bars, baseline definitions, or validation details; reconstruction performance cannot be assessed from the given text.
  2. Calibration description (Methods): The framework is explicitly calibrated to microdata to reconstruct the CCI target series; without evidence that the calibration targets (microdata alignment, belief expansion weights, inertia parameters) are independent of the CCI itself, the reported reconstruction superiority is consistent with fitting rather than out-of-sample validation. This is load-bearing for the central claim of generative validity from emergent micro mechanisms.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of results and the distinction between calibration and validation. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: Abstract: The abstract asserts first-place ranking and predictive gains but supplies no numerical metrics, error bars, baseline definitions, or validation details; reconstruction performance cannot be assessed from the given text.

    Authors: We agree that the abstract would be strengthened by including quantitative details. In the revised version we will add the key reconstruction metrics (e.g., ranking and relative error reductions versus baselines), a concise definition of the baseline set, and a brief statement of the validation procedure.

    Revision: yes

  2. Referee: Calibration description (Methods): The framework is explicitly calibrated to microdata to reconstruct the CCI target series; without evidence that the calibration targets (microdata alignment, belief expansion weights, inertia parameters) are independent of the CCI itself, the reported reconstruction superiority is consistent with fitting rather than out-of-sample validation. This is load-bearing for the central claim of generative validity from emergent micro mechanisms.

    Authors: Calibration is performed exclusively on micro-level data (individual survey responses and demographic distributions drawn from sources such as the Survey of Consumer Expectations and census statistics) that pre-date and are statistically independent of the aggregate CCI series. The CCI is used solely as the out-of-sample validation target against which emergent macro dynamics are evaluated. We will add an explicit subsection in Methods documenting the calibration data sources, the precise calibration objectives, and their separation from the CCI target. The existing ablation results further support that performance depends on the micro mechanisms rather than direct fitting to the aggregate series.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; reconstruction framed as emergent from microdata calibration

Full rationale

The provided abstract and description frame ConsumerSim as a generative framework that reconstructs CCI from a microdata-calibrated synthetic population plus signals, survey-like generation, post-stratification, and inertia alignment. No equations, self-citations, or explicit fitting steps are quoted that reduce the reported reconstruction metrics or salience gains to the CCI target series by construction. Calibration is described as microdata-driven rather than tuned to minimize error on the U.S./EU27/Japan CCI targets themselves. The central claim therefore retains independent content from the micro mechanisms and does not trigger any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be extracted or verified from the provided text.

pith-pipeline@v0.9.1-grok · 5787 in / 1212 out tokens · 56349 ms · 2026-06-30T04:08:09.953898+00:00 · methodology

