arxiv: 2510.01297 · v3 · submitted 2025-10-01 · 💻 cs.MA

SimCity: Multi-Agent Urban Development Simulation with Rich Interactions

Pith reviewed 2026-05-18 10:56 UTC · model grok-4.3

classification 💻 cs.MA
keywords multi-agent simulation · large language models · macroeconomic modeling · urban development · agent-based models · economic patterns · virtual city

The pith

LLM agents in a multi-agent city simulation reproduce Okun's Law, the Phillips Curve, and other empirical economic patterns through natural-language reasoning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SimCity, a framework in which large language models direct the decisions of heterogeneous agents including households, firms, a central bank, and a government. These agents interact across a frictional labor market, a market for varied goods, and a financial market, while a vision-language model selects locations for new firms and draws the resulting city map. The central demonstration is that this approach generates several well-documented macroeconomic regularities without any manually written behavioral rules. The patterns appear consistently across separate simulation runs, offering a way to study both economy-wide trends and urban growth in the same setting.

Core claim

SimCity shows that LLM agents with transparent natural-language deliberation can participate in frictional labor, heterogeneous goods, and financial markets while a VLM handles geographic firm placement, producing price elasticity of demand, Engel's Law, Okun's Law, the Phillips Curve, and the Beveridge Curve in a manner that stays stable across runs.
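
For reference, the five regularities named above are usually written in textbook form roughly as follows. These are standard statements, not the paper's own specifications, and all symbols are introduced here for illustration: $Q$ and $P$ are quantity demanded and price, $w_{\text{food}}$ is the food share of income $Y$, $u_t$ is the unemployment rate, $g_t$ is real output growth with trend $\bar{g}$, $\pi_t$ is inflation with expected value $\pi^{e}_t$, $u^{*}$ is a reference unemployment rate, and $v$ is the vacancy rate.

```latex
\begin{align*}
\text{Price elasticity of demand:}\quad
  & \varepsilon = \frac{\partial Q}{\partial P}\,\frac{P}{Q} < 0 \\
\text{Engel's Law:}\quad
  & \frac{\partial w_{\text{food}}}{\partial Y} < 0 \\
\text{Okun's Law:}\quad
  & \Delta u_t \approx -\beta\,(g_t - \bar{g}), \qquad \beta > 0 \\
\text{Phillips Curve:}\quad
  & \pi_t \approx \pi^{e}_t - \kappa\,(u_t - u^{*}), \qquad \kappa > 0 \\
\text{Beveridge Curve:}\quad
  & \frac{\partial v}{\partial u} < 0
\end{align*}
```

In each case the testable content is a sign: a simulation reproduces the pattern only if the corresponding slope estimated from its output has the sign shown.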

What carries the argument

LLM agents that reason about economic choices in natural language, together with VLM-driven placement of firms on a virtual map.

If this is right

  • The same environment can be used to track both macroeconomic regularities and the spatial expansion of the virtual city.
  • Agent decisions remain interpretable because each choice comes with explicit natural-language reasoning.
  • The framework avoids the need to pre-specify fixed rules for every possible market interaction.
  • Results hold steady when the simulation is restarted with different random seeds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar setups could let researchers test how changes in government policy alter agent behavior by simply updating the prompts given to the language models.
  • The approach might be extended to simulate responses to external shocks such as supply-chain disruptions by feeding new information into the agents' reasoning steps.
  • Calibration against real city data could allow the virtual map to reflect observed patterns of firm location and growth.

Load-bearing premise

Large language models will produce realistic and adaptive economic decisions when cast as households, firms, and policy institutions.

What would settle it

Run multiple independent simulations and check whether the generated unemployment and inflation series consistently show the inverse relationship predicted by the Phillips Curve.
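
A minimal sketch of such a check, assuming each run exports one unemployment series and one inflation series of equal length. The function names and the data layout (a dict keyed by seed) are hypothetical, and the numbers are placeholders invented for illustration, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def phillips_sign_check(runs):
    """Return per-seed correlations and whether every one is negative."""
    corrs = {seed: pearson(u, pi) for seed, (u, pi) in runs.items()}
    return corrs, all(c < 0 for c in corrs.values())


# Placeholder series (NOT from the paper): seed -> (unemployment %, inflation %).
runs = {
    0: ([4.0, 5.0, 6.0, 7.0], [3.0, 2.5, 2.1, 1.4]),
    1: ([3.5, 4.5, 6.5, 8.0], [3.2, 2.9, 1.8, 1.0]),
}

corrs, consistent = phillips_sign_check(runs)
print(corrs, consistent)
```

A sign check is deliberately weak: it would detect a run in which the relationship flips, but it says nothing about how closely the slope matches empirical estimates.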

Figures

Figures reproduced from arXiv: 2510.01297 by Hongyu Su, Tianxing He, Yeqi Feng, Yixin Tao, Yucheng Lu.

Figure 1: The framework of SimCity. Left: A visualized map with three types of buildings. Right: ...
Figure 2: Emergence of the Phillips Curve and Okun’s Law in SimCity simulations.
Figure 3: Beveridge Curve and other macroeconomic emergences from SimCity.
Figure 4: The results from different random seeds demonstrate that the observed regularity is robust.
Figure 5: GDP & population curves, and map changes during the move-in phase.
Figure 6: External impulse (at year 15) does not significantly affect tendency of prices of goods.
Figure 7: External impulse does not significantly affect long-run prices.
Figure 8: Structure of prompts.
Original abstract

Large Language Models (LLMs) open new possibilities for constructing realistic and interpretable macroeconomic simulations. We present SimCity, a multi-agent framework that leverages LLMs to model an interpretable macroeconomic system with heterogeneous agents and rich interactions. Unlike classical equilibrium models that limit heterogeneity for tractability, or traditional agent-based models (ABMs) that rely on hand-crafted decision rules, SimCity enables flexible, adaptive behavior with transparent natural-language reasoning. Within SimCity, four core agent types (households, firms, a central bank, and a government) deliberate and participate in a frictional labor market, a heterogeneous goods market, and a financial market. Furthermore, a Vision-Language Model (VLM) determines the geographic placement of new firms and renders a mapped virtual city, allowing us to study both macroeconomic regularities and urban expansion dynamics within a unified environment. To evaluate the framework, we compile a checklist of canonical macroeconomic phenomena, including price elasticity of demand, Engel's Law, Okun's Law, the Phillips Curve, and the Beveridge Curve, and show that SimCity naturally reproduces these empirical patterns while remaining robust across simulation runs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces SimCity, a multi-agent framework that employs LLMs to simulate a macroeconomic system with four heterogeneous agent types (households, firms, central bank, government) interacting in frictional labor, goods, and financial markets. A VLM component handles geographic firm placement and generates a mapped virtual city. The central claim is that this setup reproduces canonical empirical patterns—price elasticity of demand, Engel's Law, Okun's Law, the Phillips Curve, and the Beveridge Curve—naturally and robustly across runs, while providing transparent natural-language reasoning that contrasts with hand-crafted rules in traditional ABMs or restrictive assumptions in equilibrium models.

Significance. If the reproduced patterns can be shown to emerge specifically from the described market frictions, agent heterogeneity, and adaptive deliberations rather than LLM pretraining, the framework would provide a useful advance in interpretable multi-agent macroeconomic modeling. The unified treatment of macro regularities and urban expansion dynamics via VLM is a distinctive feature that could support new research at the intersection of agent-based systems and spatial economics.

major comments (3)
  1. [Abstract] Abstract: The claim that SimCity 'naturally reproduces' the listed macroeconomic phenomena (price elasticity, Engel's Law, Okun's Law, Phillips Curve, Beveridge Curve) is load-bearing for the paper's contribution, yet the abstract and evaluation provide no quantitative metrics, statistical tests, or benchmark comparisons (e.g., regression slopes or R² values against empirical data) to substantiate the reproduction.
  2. [Evaluation section] Evaluation section: To establish that the patterns arise from the frictional markets and heterogeneous agent interactions rather than from LLMs' pre-trained knowledge of economic relationships, the manuscript requires explicit controls such as ablation runs with prompts that suppress economic terminology, comparison to non-LLM rule-based baselines, or tests with randomized agent priors; without these, attribution to the simulation framework remains insecure.
  3. [Methods/Implementation] Methods/Implementation: The description of agent deliberation and market clearing lacks sufficient detail on prompting strategies, temperature settings, context length, or mechanisms to prevent implicit fitting through prompt engineering, which directly affects the robustness claim across simulation runs.
minor comments (2)
  1. [Abstract] The abstract would be clearer if it briefly stated the number of agents, simulation duration, or scale of the virtual city used in the reported runs.
  2. Notation for agent types and market interactions could be standardized earlier to aid readability when describing the labor/goods/financial market setup.
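
The kind of quantitative summary requested in major comment 1 can be sketched as an ordinary least squares fit that reports slope and R². This is an illustrative sketch only: `ols_fit` is a hypothetical helper, and the Okun's-Law-style data are invented placeholders, not values from the paper.

```python
def ols_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


# Placeholder data (NOT from the paper): real GDP growth in percent and the
# change in the unemployment rate in percentage points.
gdp_growth = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
delta_unemp = [1.6, 1.1, 0.4, 0.1, -0.6, -0.9]

a, b, r2 = ols_fit(gdp_growth, delta_unemp)
print(f"intercept={a:.3f} slope={b:.3f} R^2={r2:.3f}")
```

Reporting the fitted slope next to published empirical estimates would let a reader judge magnitude as well as direction.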

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive report. We address each major comment below and commit to revisions that strengthen the manuscript's claims regarding reproducibility and attribution.

  1. Referee: [Abstract] Abstract: The claim that SimCity 'naturally reproduces' the listed macroeconomic phenomena (price elasticity, Engel's Law, Okun's Law, Phillips Curve, Beveridge Curve) is load-bearing for the paper's contribution, yet the abstract and evaluation provide no quantitative metrics, statistical tests, or benchmark comparisons (e.g., regression slopes or R² values against empirical data) to substantiate the reproduction.

    Authors: We appreciate this observation. The manuscript currently demonstrates the patterns via consistent simulation outcomes and visualizations across runs. To provide stronger substantiation, we will revise the abstract and evaluation section to incorporate quantitative metrics such as regression slopes, R² values, and comparisons against empirical benchmarks. revision: yes

  2. Referee: [Evaluation section] Evaluation section: To establish that the patterns arise from the frictional markets and heterogeneous agent interactions rather than from LLMs' pre-trained knowledge of economic relationships, the manuscript requires explicit controls such as ablation runs with prompts that suppress economic terminology, comparison to non-LLM rule-based baselines, or tests with randomized agent priors; without these, attribution to the simulation framework remains insecure.

    Authors: This is a fair point on causal attribution. Our results show the patterns emerging robustly from heterogeneous agents operating in frictional markets with natural-language deliberation. We will add explicit ablation experiments, including non-LLM rule-based baselines and prompt modifications, to the revised evaluation to better isolate the framework's contributions. revision: yes

  3. Referee: [Methods/Implementation] Methods/Implementation: The description of agent deliberation and market clearing lacks sufficient detail on prompting strategies, temperature settings, context length, or mechanisms to prevent implicit fitting through prompt engineering, which directly affects the robustness claim across simulation runs.

    Authors: We agree that additional implementation transparency is required for reproducibility. The revised Methods section will include detailed descriptions of prompting strategies, temperature values, context lengths, and safeguards against prompt engineering to support the robustness claims. revision: yes
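
One way to make the settings named in major comment 3 auditable is to log them as a structured record with every run. The sketch below is an assumption about how that could look, not the authors' implementation: `RunConfig`, its field names, and the example values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical per-run settings; every field name is an assumption."""
    model_name: str           # placeholder identifier for the agent LLM
    temperature: float        # sampling temperature
    max_context_tokens: int   # context length available to each agent
    seed: int                 # random seed for the simulation run
    prompt_version: str       # label for the prompt templates in use

    def __post_init__(self):
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if self.max_context_tokens <= 0:
            raise ValueError("max_context_tokens must be positive")


def to_json(cfg):
    """Serialize a config with sorted keys so logs are easy to diff."""
    return json.dumps(asdict(cfg), sort_keys=True)


def from_json(text):
    """Rebuild a config from its JSON form, re-running validation."""
    return RunConfig(**json.loads(text))


# Placeholder values for illustration only.
cfg = RunConfig(
    model_name="placeholder-llm",
    temperature=0.7,
    max_context_tokens=8192,
    seed=0,
    prompt_version="v1",
)
restored = from_json(to_json(cfg))
print(restored == cfg)
```

Storing one such record per seed would let a reader confirm that runs compared for robustness differ only in the seed.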

Circularity Check

0 steps flagged

No circularity: reproduction presented as validation outcome

The paper's core claim is that a multi-agent LLM simulation with heterogeneous agents and frictional markets reproduces canonical macroeconomic patterns (price elasticity, Engel's Law, Okun's Law, Phillips Curve, Beveridge Curve) as an emergent result. No equations, parameter fits, or self-citations are shown that reduce the reproduction to an input by construction. The abstract explicitly contrasts the approach with hand-crafted rules and presents the patterns as naturally arising from the setup, which is a standard external-benchmark validation for ABM-style models. No self-definitional, fitted-prediction, or load-bearing self-citation steps are identifiable from the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the capabilities of LLMs and VLMs to simulate agent behaviors and spatial decisions without additional fitted parameters explicitly mentioned.

axioms (1)
  • domain assumption: Large language models can generate realistic and adaptive economic decision-making through natural language reasoning.
    This is the core premise enabling the framework over hand-crafted rules.

pith-pipeline@v0.9.0 · 5737 in / 1308 out tokens · 61541 ms · 2026-05-18T10:56:30.092013+00:00 · methodology
