Bridging the Gap between Micro-scale Traffic Simulation and 4D Digital Cityscapes
Pith reviewed 2026-05-10 17:05 UTC · model grok-4.3
The pith
A framework couples SUMO traffic simulations with photorealistic VR city models of Zurich, producing visualizations where users correctly interpret safety risks and spatial audio further alters those judgments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a synchronized data pipeline can render micro-scale SUMO traffic movements inside a photorealistic, real-time VR cityscape while preserving enough fidelity for human observers to correctly judge safety risks; the same study further shows that adding spatialized audio changes those risk judgments, establishing the value of multimodality for traffic perception tasks.
What carries the argument
A C++ pipeline that streams live SUMO vehicle positions and states into Unreal Engine 5's geospatially accurate Zurich model for synchronized 4D rendering, together with an OSC interface that allows external engines to supply spatial audio.
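To make the OSC side of that pipeline concrete, here is a minimal sketch of how one vehicle update could be packed into an OSC 1.0 message in C++. The address `/vehicle/pose`, the `VehicleState` struct, and the argument layout (id, x, y, speed) are hypothetical choices for illustration; the paper's actual message schema is not described here. Only the byte layout (NUL-padded strings, big-endian 32-bit arguments) follows the OSC 1.0 specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append an OSC-string: the ASCII bytes, a terminating NUL, then NUL padding
// so the total length is a multiple of 4 (OSC 1.0).
inline void appendOscString(std::vector<std::uint8_t>& out, const std::string& s) {
    out.insert(out.end(), s.begin(), s.end());
    const std::size_t padded = (s.size() / 4 + 1) * 4;  // guarantees at least one NUL
    out.insert(out.end(), padded - s.size(), 0);
}

// Append a 32-bit value in big-endian (network) byte order.
inline void appendBigEndian32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>((v >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Append an IEEE 754 float32 by copying its bit pattern, then writing big-endian.
inline void appendFloat32(std::vector<std::uint8_t>& out, float f) {
    static_assert(sizeof(float) == 4, "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    appendBigEndian32(out, bits);
}

// Hypothetical per-vehicle payload; field names are assumptions.
struct VehicleState {
    std::int32_t id;
    float x;
    float y;
    float speed;
};

// Encode one message: address pattern, type tag string ",ifff", then arguments.
inline std::vector<std::uint8_t> encodeVehicleMessage(const VehicleState& v) {
    std::vector<std::uint8_t> msg;
    appendOscString(msg, "/vehicle/pose");  // hypothetical address
    appendOscString(msg, ",ifff");          // int32 followed by three float32
    appendBigEndian32(msg, static_cast<std::uint32_t>(v.id));
    appendFloat32(msg, v.x);
    appendFloat32(msg, v.y);
    appendFloat32(msg, v.speed);
    return msg;
}
```

For `VehicleState{1, 1.0f, 0.0f, 0.0f}` this yields a 40-byte packet (16 bytes of address, 8 of type tags, 16 of arguments) that could be sent as a single UDP datagram to an external auralization engine.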
If this is right
- Urban planners can present traffic scenarios to stakeholders through direct perceptual experience rather than through charts or abstract statistics.
- Safety evaluations of proposed traffic changes can incorporate measured human responses to both visual and auditory cues.
- Multimodal simulation becomes a practical requirement when the goal is realistic communication of risk rather than purely quantitative output.
- Real-time 4D environments open the possibility of interactive what-if testing of traffic policies inside an immersive setting.
Where Pith is reading between the lines
- The same pipeline could be tested with live sensor feeds from actual city infrastructure to create predictive rather than purely simulated experiences.
- Different demographic groups might show systematic differences in how they read safety from the same visual-audio combination, suggesting targeted calibration of the framework.
- The approach could be extended to test whether repeated exposure to such VR scenarios changes people's real-world driving or walking behavior around traffic.
- Integration with autonomous-vehicle simulation layers would allow direct comparison of human perceptual thresholds against machine perception of the same scenes.
Load-bearing premise
Participants' safety ratings inside the VR environment correspond to the judgments they would form when encountering the same traffic conditions in the physical world.
What would settle it
A follow-up experiment in which the same participants experience matched real-world traffic scenes in Zurich (on foot or from a vehicle) and their safety ratings are compared directly against the ratings they gave for the corresponding VR sequences.
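The direct comparison described above could be summarized with a correlation coefficient over matched scenes. The following sketch computes Pearson's r for paired VR and on-site ratings. The function name and the choice of Pearson's r are illustrative assumptions (for ordinal rating scales a rank-based coefficient may be more appropriate), and no data from the study is implied.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Pearson correlation between two equally long rating series,
// e.g. per-scene VR ratings and per-scene on-site ratings (hypothetical inputs).
inline double pearsonR(const std::vector<double>& vr, const std::vector<double>& onSite) {
    if (vr.size() != onSite.size() || vr.size() < 2) {
        throw std::invalid_argument("need two series of equal length >= 2");
    }
    const std::size_t n = vr.size();

    double meanVr = 0.0, meanSite = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanVr += vr[i];
        meanSite += onSite[i];
    }
    meanVr /= static_cast<double>(n);
    meanSite /= static_cast<double>(n);

    // Accumulate the co-deviation and the two sums of squared deviations.
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = vr[i] - meanVr;
        const double dy = onSite[i] - meanSite;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if (sxx == 0.0 || syy == 0.0) {
        throw std::domain_error("correlation undefined for a constant series");
    }
    return sxy / std::sqrt(sxx * syy);
}
```

A value near 1 would indicate that scenes rated as riskier in VR are also rated as riskier on site; it would not by itself show that the absolute rating levels match.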
Original abstract
While micro-scale traffic simulations provide essential data for urban planning, they are rarely coupled with the high-fidelity visualization or auralization necessary for effective stakeholder communication. In this work, we present a real-time 4D visualization framework that couples the SUMO traffic simulator with a photorealistic, geospatially accurate VR representation of Zurich in Unreal Engine 5. Our architecture implements a robust C++ data pipeline for synchronized vehicle visualization and features an Open Sound Control (OSC) interface to support external auralization engines. We validate the framework through a user study assessing the correlation between simulated traffic dynamics and human perception. Results demonstrate a high degree of perceptual alignment, where users correctly interpret safety risks from the 4D simulation. Furthermore, our findings indicate that the inclusion of spatialized audio alters the user's sense of safety, showing the importance of multimodality in traffic simulations.
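One common way to keep vehicle rendering synchronized when a simulator steps more slowly than the renderer is to interpolate between the two most recent simulation snapshots. The sketch below illustrates that idea. It is an assumption about how such a pipeline might work, not the authors' documented method, and the `Pose`, `Snapshot`, and `interpolate` names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

struct Pose {
    double x;           // metres in the simulation's planar frame
    double y;
    double headingDeg;  // expected in [0, 360)
};

struct Snapshot {
    double time;  // simulation time in seconds
    Pose pose;
};

inline double lerp(double a, double b, double t) {
    return a + (b - a) * t;
}

// Interpolate headings along the shortest arc so 350 -> 10 passes through 0,
// not through 180. The result is normalized to [0, 360).
inline double lerpAngleDeg(double a, double b, double t) {
    const double diff = std::fmod(b - a + 540.0, 360.0) - 180.0;
    return std::fmod(a + diff * t + 360.0, 360.0);
}

// Blend two consecutive snapshots for a render timestamp between them.
// Times outside [prev.time, next.time] are clamped to the nearer snapshot.
inline Pose interpolate(const Snapshot& prev, const Snapshot& next, double renderTime) {
    const double span = next.time - prev.time;
    double t = span > 0.0 ? (renderTime - prev.time) / span : 1.0;
    t = std::max(0.0, std::min(1.0, t));
    return Pose{
        lerp(prev.pose.x, next.pose.x, t),
        lerp(prev.pose.y, next.pose.y, t),
        lerpAngleDeg(prev.pose.headingDeg, next.pose.headingDeg, t),
    };
}
```

Clamping rather than extrapolating trades a small amount of latency (the renderer trails the simulator by up to one step) for the guarantee that a vehicle is never drawn ahead of a position the simulator has actually produced.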
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes a real-time 4D visualization framework coupling the SUMO micro-scale traffic simulator to a photorealistic, geospatially accurate VR model of Zurich in Unreal Engine 5, with a C++ data pipeline for vehicle synchronization and an OSC interface for external auralization. It validates the system via a user study asserting high perceptual alignment (users correctly interpret safety risks from the simulation) and that spatialized audio alters safety perceptions, thereby demonstrating the value of multimodality.
Significance. If the empirical claims hold with proper quantitative support and external validation, the framework would provide a useful tool for immersive stakeholder communication in urban planning, correctly highlighting the role of audio cues in safety perception. The open C++ pipeline and OSC interface are strengths that support reproducibility and extension by others.
Major comments (2)
- [User Study] User Study section: The central claims of 'high degree of perceptual alignment' and audio altering 'the user's sense of safety' are presented without any reported sample sizes, statistical tests, effect sizes, correlation coefficients, or quantitative results. This is load-bearing because the paper's contribution rests on the user-study validation rather than the framework description alone.
- [Validation and Data Pipeline] Validation and Data Pipeline sections: No comparison is made between SUMO outputs and real traffic sensor data for the Zurich scenarios, nor between VR-based safety perceptions and on-site human judgments at the actual locations. This leaves the ecological validity of the 'perceptual alignment' claim unverified for the intended urban-planning application.
Minor comments (1)
- [Abstract] Abstract: The summary of results would be clearer if it included at least one concrete quantitative indicator (e.g., percentage agreement or p-value) rather than the qualitative phrase 'high degree of perceptual alignment'.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important areas for strengthening the manuscript. We address each major comment point by point below, indicating planned revisions where appropriate.
- Referee: [User Study] User Study section: The central claims of 'high degree of perceptual alignment' and audio altering 'the user's sense of safety' are presented without any reported sample sizes, statistical tests, effect sizes, correlation coefficients, or quantitative results. This is load-bearing because the paper's contribution rests on the user-study validation rather than the framework description alone.
  Authors: We agree that the user study reporting requires additional quantitative detail to properly support the claims. In the revised manuscript, we will expand the User Study section to include the participant sample size, the specific statistical tests performed (such as tests for differences in safety perceptions with and without audio), associated p-values, effect sizes, and any correlation coefficients between simulated traffic metrics and user responses. This will provide the necessary rigor and transparency for the validation.
  Revision: yes
- Referee: [Validation and Data Pipeline] Validation and Data Pipeline sections: No comparison is made between SUMO outputs and real traffic sensor data for the Zurich scenarios, nor between VR-based safety perceptions and on-site human judgments at the actual locations. This leaves the ecological validity of the 'perceptual alignment' claim unverified for the intended urban-planning application.
  Authors: Our contribution centers on the real-time coupling architecture, C++ pipeline, OSC interface, and perceptual validation via controlled VR user studies, building on the established validity of SUMO from prior literature rather than re-validating the simulator outputs. We will revise the manuscript to include an expanded limitations discussion that explicitly acknowledges the lack of direct SUMO-to-sensor comparisons and on-site perceptual benchmarks, along with the rationale for the chosen validation approach. However, we do not have the requisite real-world sensor data or on-site judgment datasets for the specific scenarios.
  Revision: partial
- Direct comparison between SUMO simulation outputs and real traffic sensor data for the Zurich scenarios
- On-site human safety perception judgments at the physical locations for comparison to VR results
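The quantitative reporting discussed in the rebuttal (a test for differences in safety ratings with and without audio, plus an effect size) could take roughly the following form for a within-subject design. This is a minimal sketch under the assumption that each participant rated the same scenes in both conditions; `PairedResult` and `pairedComparison` are hypothetical names, and the p-value lookup against a t distribution is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct PairedResult {
    double meanDiff;  // mean of (withAudio - withoutAudio)
    double sdDiff;    // sample standard deviation of the differences
    double t;         // paired t statistic
    double cohenDz;   // effect size for paired designs: meanDiff / sdDiff
    std::size_t df;   // degrees of freedom, n - 1
};

// Paired comparison of per-participant ratings in two conditions.
// If every difference is identical, sdDiff is 0 and t and cohenDz are not finite.
inline PairedResult pairedComparison(const std::vector<double>& withAudio,
                                     const std::vector<double>& withoutAudio) {
    if (withAudio.size() != withoutAudio.size() || withAudio.size() < 2) {
        throw std::invalid_argument("need two series of equal length >= 2");
    }
    const std::size_t n = withAudio.size();

    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean += withAudio[i] - withoutAudio[i];
    }
    mean /= static_cast<double>(n);

    double sumSq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = (withAudio[i] - withoutAudio[i]) - mean;
        sumSq += d * d;
    }
    const double sd = std::sqrt(sumSq / static_cast<double>(n - 1));
    const double standardError = sd / std::sqrt(static_cast<double>(n));

    return PairedResult{mean, sd, mean / standardError, mean / sd, n - 1};
}
```

Reporting `t`, `df`, and `cohenDz` together with the sample size would address the referee's request for statistical tests and effect sizes; a nonparametric alternative may be preferable if the rating scale is coarse.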
Circularity Check
No circularity detected in derivation chain
The paper presents an applied engineering framework coupling SUMO traffic simulation with Unreal Engine 5 VR visualization and an OSC audio interface, validated through a user study on perceptual safety alignment. No mathematical derivations, equations, fitted parameters, or predictions are described that could reduce to self-definition, fitted inputs, or self-citation chains. Central claims rest directly on empirical user-study outcomes rather than any load-bearing self-referential logic, uniqueness theorems, or ansatz smuggling. The work is self-contained as a descriptive system contribution with no circular reductions by construction.