
arxiv: 2604.08497 · v1 · submitted 2026-04-09 · 💻 cs.HC · cs.SD

Bridging the Gap between Micro-scale Traffic Simulation and 4D Digital Cityscapes

Pith reviewed 2026-05-10 17:05 UTC · model grok-4.3

classification 💻 cs.HC cs.SD
keywords: traffic simulation, virtual reality, 4D visualization, perceptual validation, spatial audio, urban planning, SUMO, Unreal Engine

The pith

A framework couples SUMO traffic simulations with a photorealistic VR city model of Zurich, producing visualizations from which users correctly interpret safety risks; adding spatial audio further alters those judgments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a real-time system that feeds output from the SUMO traffic simulator into a detailed, geospatially accurate virtual model of Zurich rendered in Unreal Engine 5. This creates an immersive 4D environment with synchronized vehicle movement and an interface for adding external spatial audio. A user study then checks whether people viewing these scenes form safety assessments that match the underlying simulation data. The results show strong agreement on risk levels, yet the presence of spatialized sound measurably changes how safe participants feel. The work targets the common problem that raw traffic data remains hard to communicate effectively to planners and the public.
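
Feeding simulator output into a game-engine scene requires, at minimum, a change of units and axes. The following is a minimal sketch of that step, not the paper's implementation. It relies on two known conventions: SUMO reports positions in metres with headings in degrees clockwise from north, and Unreal Engine uses centimetres in a left-handed, Z-up frame. The axis alignment (+X east, +Y south), the `UnrealPose` type, and the `sumo_to_unreal` function are assumptions made for illustration, and the geo-referencing offset of the Zurich model is ignored.

```python
from dataclasses import dataclass

CM_PER_M = 100.0  # Unreal Engine world units are centimetres


@dataclass(frozen=True)
class UnrealPose:
    """Hypothetical container for a planar vehicle pose in engine units."""
    x_cm: float
    y_cm: float
    yaw_deg: float


def sumo_to_unreal(x_m: float, y_m: float, angle_deg: float) -> UnrealPose:
    """Map a SUMO pose to an engine pose under the assumed axis alignment.

    SUMO side: metres, x east, y north, angle clockwise from north.
    Assumed engine side: +X east, +Y south, yaw 0 along +X,
    positive yaw turning from +X toward +Y.
    """
    # Heading 90 (east) must become yaw 0; wrap the result to [-180, 180).
    yaw = (angle_deg - 90.0 + 180.0) % 360.0 - 180.0
    # Flip y to move from a right-handed to a left-handed frame.
    return UnrealPose(x_m * CM_PER_M, -y_m * CM_PER_M, yaw)
```

Under these assumptions, a vehicle heading east in SUMO receives yaw 0 and a vehicle heading south receives yaw 90. A real scene would also need the origin and rotation of its geo-referenced tiles.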

Core claim

The central claim is that a synchronized data pipeline can render micro-scale SUMO traffic movements inside a photorealistic, real-time VR cityscape while preserving enough fidelity for human observers to correctly judge safety risks; the same study further shows that adding spatialized audio changes those risk judgments, establishing the value of multimodality for traffic perception tasks.
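
A simulator usually advances in coarser time steps than a VR renderer draws frames, so a synchronized pipeline needs some rule for placing vehicles between steps. The summary does not say which rule the authors use. The sketch below shows one common option, linear interpolation with shortest-arc heading blending; the function names and the `(x, y, angle_deg)` tuple layout are hypothetical.

```python
def lerp_angle(a_deg: float, b_deg: float, t: float) -> float:
    """Blend two headings along the shortest arc, result in [0, 360)."""
    delta = (b_deg - a_deg + 180.0) % 360.0 - 180.0
    return (a_deg + delta * t) % 360.0


def interpolate_pose(prev, nxt, t_prev, t_next, t_render):
    """Interpolate between two simulation steps for one render frame.

    prev and nxt are (x, y, angle_deg) tuples sampled at simulation
    times t_prev < t_next; t_render is the frame's timestamp.
    """
    t = (t_render - t_prev) / (t_next - t_prev)
    t = min(max(t, 0.0), 1.0)  # clamp: early or late frames hold an endpoint
    x = prev[0] + (nxt[0] - prev[0]) * t
    y = prev[1] + (nxt[1] - prev[1]) * t
    return (x, y, lerp_angle(prev[2], nxt[2], t))
```

Clamping trades a brief stall for never extrapolating past known simulator state; whether that trade suits a given scenario is a design decision the summary does not address.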

What carries the argument

A C++ pipeline that streams live SUMO vehicle positions and states into Unreal Engine 5's geospatially accurate Zurich model for synchronized 4D rendering, together with an OSC interface that allows external engines to supply spatial audio.
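
To make the OSC side concrete, the sketch below encodes a message the way the OSC 1.0 specification lays it out: a null-terminated address padded to four bytes, a type-tag string padded the same way, then big-endian 32-bit floats. It is written in Python for brevity, although the paper's pipeline is C++. The address `/vehicle/pos` and the choice of three float arguments are hypothetical; the summary does not describe the authors' message schema.

```python
import struct


def _osc_string(s: str) -> bytes:
    """ASCII string, null-terminated, zero-padded to a multiple of 4 bytes."""
    raw = s.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def encode_osc(address: str, *floats: float) -> bytes:
    """Encode an OSC 1.0 message whose arguments are all float32."""
    type_tags = "," + "f" * len(floats)
    payload = struct.pack(f">{len(floats)}f", *floats)  # big-endian float32
    return _osc_string(address) + _osc_string(type_tags) + payload
```

The resulting bytes can be sent as a single UDP datagram with the standard `socket` module, which is how OSC is typically transported to an external audio engine.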

If this is right

  • Urban planners can present traffic scenarios to stakeholders through direct perceptual experience rather than through charts or abstract statistics.
  • Safety evaluations of proposed traffic changes can incorporate measured human responses to both visual and auditory cues.
  • Multimodal simulation becomes a practical requirement when the goal is realistic communication of risk rather than purely quantitative output.
  • Real-time 4D environments open the possibility of interactive what-if testing of traffic policies inside an immersive setting.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same pipeline could be tested with live sensor feeds from actual city infrastructure to create predictive rather than purely simulated experiences.
  • Different demographic groups might show systematic differences in how they read safety from the same visual-audio combination, suggesting targeted calibration of the framework.
  • The approach could be extended to test whether repeated exposure to such VR scenarios changes people's real-world driving or walking behavior around traffic.
  • Integration with autonomous-vehicle simulation layers would allow direct comparison of human perceptual thresholds against machine perception of the same scenes.

Load-bearing premise

Participants' safety ratings inside the VR environment correspond to the judgments they would form when encountering the same traffic conditions in the physical world.

What would settle it

A follow-up experiment in which the same participants experience matched real-world traffic scenes in Zurich (on foot or from a vehicle) and their safety ratings are compared directly against the ratings they gave for the corresponding VR sequences.

Figures

Figures reproduced from arXiv: 2604.08497 by Jonas Egeler, Longxiang Jiao, Lukas Hofmann, Yiru Yang, Zhanyi Wu.

Figure 1: The default 2D visualization interface provided by …
Figure 2: Our proposed framework in Unreal Engine 5, visualiz…
Figure 3: Architecture overview. The Bridge Actor ingests raw …
Figure 4: User study results comparing the Morning (slow traffic) and Evening (fast traffic) scenarios. The charts display mean participant …
Original abstract

While micro-scale traffic simulations provide essential data for urban planning, they are rarely coupled with the high-fidelity visualization or auralization necessary for effective stakeholder communication. In this work, we present a real-time 4D visualization framework that couples the SUMO traffic with a photorealistic, geospatially accurate VR representation of Zurich in Unreal Engine 5. Our architecture implements a robust C++ data pipeline for synchronized vehicle visualization and features an Open Sound Control (OSC) interface to support external auralization engines. We validate the framework through a user study assessing the correlation between simulated traffic dynamics and human perception. Results demonstrate a high degree of perceptual alignment, where users correctly interpret safety risks from the 4D simulation. Furthermore, our findings indicate that the inclusion of spatialized audio alters the user's sense of safety, showing the importance of multimodality in traffic simulations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript describes a real-time 4D visualization framework coupling the SUMO micro-scale traffic simulator to a photorealistic, geospatially accurate VR model of Zurich in Unreal Engine 5, with a C++ data pipeline for vehicle synchronization and an OSC interface for external auralization. It validates the system via a user study asserting high perceptual alignment (users correctly interpret safety risks from the simulation) and that spatialized audio alters safety perceptions, thereby demonstrating the value of multimodality.

Significance. If the empirical claims hold with proper quantitative support and external validation, the framework would provide a useful tool for immersive stakeholder communication in urban planning, correctly highlighting the role of audio cues in safety perception. The open C++ pipeline and OSC interface are strengths that support reproducibility and extension by others.

major comments (2)
  1. [User Study] User Study section: The central claims of 'high degree of perceptual alignment' and audio altering 'the user's sense of safety' are presented without any reported sample sizes, statistical tests, effect sizes, correlation coefficients, or quantitative results. This is load-bearing because the paper's contribution rests on the user-study validation rather than the framework description alone.
  2. [Validation and Data Pipeline] Validation and Data Pipeline sections: No comparison is made between SUMO outputs and real traffic sensor data for the Zurich scenarios, nor between VR-based safety perceptions and on-site human judgments at the actual locations. This leaves the ecological validity of the 'perceptual alignment' claim unverified for the intended urban-planning application.
minor comments (1)
  1. [Abstract] Abstract: The summary of results would be clearer if it included at least one concrete quantitative indicator (e.g., percentage agreement or p-value) rather than the qualitative phrase 'high degree of perceptual alignment'.
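
The quantitative indicators the referee asks for could be computed along the following lines. This is a sketch using only the Python standard library, not the authors' analysis: `pearson_r` relates a simulated traffic metric to participants' ratings, and `paired_cohens_d` gives a within-subject effect size for ratings with versus without audio. Both function names and the paired study design are assumptions; for ordinal rating scales a rank-based measure may be the more appropriate choice.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def paired_cohens_d(with_audio, without_audio):
    """Effect size for paired ratings: mean difference over SD of differences."""
    diffs = [a - b for a, b in zip(with_audio, without_audio)]
    return mean(diffs) / stdev(diffs)
```

Each function expects one value per participant (or per scenario), in matching order across the two inputs.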

Simulated Authors' Rebuttal

2 responses · 2 unresolved

We thank the referee for their constructive and detailed comments, which highlight important areas for strengthening the manuscript. We address each major comment point by point below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [User Study] User Study section: The central claims of 'high degree of perceptual alignment' and audio altering 'the user's sense of safety' are presented without any reported sample sizes, statistical tests, effect sizes, correlation coefficients, or quantitative results. This is load-bearing because the paper's contribution rests on the user-study validation rather than the framework description alone.

    Authors: We agree that the user study reporting requires additional quantitative detail to properly support the claims. In the revised manuscript, we will expand the User Study section to include the participant sample size, the specific statistical tests performed (such as tests for differences in safety perceptions with and without audio), associated p-values, effect sizes, and any correlation coefficients between simulated traffic metrics and user responses. This will provide the necessary rigor and transparency for the validation.
    revision: yes

  2. Referee: [Validation and Data Pipeline] Validation and Data Pipeline sections: No comparison is made between SUMO outputs and real traffic sensor data for the Zurich scenarios, nor between VR-based safety perceptions and on-site human judgments at the actual locations. This leaves the ecological validity of the 'perceptual alignment' claim unverified for the intended urban-planning application.

    Authors: Our contribution centers on the real-time coupling architecture, C++ pipeline, OSC interface, and perceptual validation via controlled VR user studies, building on the established validity of SUMO from prior literature rather than re-validating the simulator outputs. We will revise the manuscript to include an expanded limitations discussion that explicitly acknowledges the lack of direct SUMO-to-sensor comparisons and on-site perceptual benchmarks, along with the rationale for the chosen validation approach. However, we do not have the requisite real-world sensor data or on-site judgment datasets for the specific scenarios.
    revision: partial

standing simulated objections not resolved
  • Direct comparison between SUMO simulation outputs and real traffic sensor data for the Zurich scenarios
  • On-site human safety perception judgments at the physical locations for comparison to VR results

Circularity Check

0 steps flagged

No circularity detected in derivation chain

The paper presents an applied engineering framework coupling SUMO traffic simulation with Unreal Engine 5 VR visualization and an OSC audio interface, validated through a user study on perceptual safety alignment. No mathematical derivations, equations, fitted parameters, or predictions are described that could reduce to self-definition, fitted inputs, or self-citation chains. Central claims rest directly on empirical user-study outcomes rather than any load-bearing self-referential logic, uniqueness theorems, or ansatz smuggling. The work is self-contained as a descriptive system contribution with no circular reductions by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As an applied systems paper, the work relies on standard assumptions from traffic simulation (SUMO), VR rendering (Unreal Engine), and human perception studies, without introducing new free parameters, axioms, or invented entities specific to the central claim.

pith-pipeline@v0.9.0 · 5459 in / 1330 out tokens · 69707 ms · 2026-05-10T17:05:03.544791+00:00 · methodology

