ARCANE-PedSynth: Synthetic Multi-Pedestrian Datasets with Behavioural Crossing Annotations
Pith reviewed 2026-06-30 00:55 UTC · model grok-4.3
The pith
ARCANE-PedSynth enables synthetic datasets with pedestrian crossing rates up to 75% via hybrid AI-manual control.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ARCANE-PedSynth overcomes CARLA's native 9% crossing rate through a hybrid AI-manual pedestrian control architecture, enabling configurable target rates up to 75%. A 12-state behavioural finite state machine with five character archetypes produces diverse crossing behaviours. The framework generates synchronised RGB, LiDAR, and DVS data with per-frame crossing labels, behavioural states, and estimated 2D pose keypoints.
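One way to read the configurable target rate is as a mixing problem. The sketch below is an illustration under stated assumptions, not the paper's method: it supposes that manually controlled pedestrians always cross and that AI-controlled ones cross at the native 9% rate, then solves for the manual share needed to reach a target. The names `manual_fraction` and `expected_rate`, and the default `manual_rate` of 1.0, are hypothetical.

```python
NATIVE_RATE = 0.09  # CARLA's native crossing rate, as quoted in the abstract


def expected_rate(m: float, native_rate: float = NATIVE_RATE,
                  manual_rate: float = 1.0) -> float:
    """Expected crossing rate when a share m of pedestrians is manually controlled.

    Assumed model: rate = m * manual_rate + (1 - m) * native_rate.
    """
    return m * manual_rate + (1.0 - m) * native_rate


def manual_fraction(target_rate: float, native_rate: float = NATIVE_RATE,
                    manual_rate: float = 1.0) -> float:
    """Share of pedestrians to place under manual control to hit target_rate."""
    if not native_rate <= target_rate <= manual_rate:
        raise ValueError("target must lie between the native and manual rates")
    if manual_rate == native_rate:
        return 0.0
    # Invert the linear mixing model above.
    return (target_rate - native_rate) / (manual_rate - native_rate)
```

Under these assumptions, a 75% target would need roughly 73% of pedestrians under manual control. The real framework may reach its targets differently.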
What carries the argument
The hybrid AI-manual pedestrian control architecture combined with a 12-state behavioural finite state machine and five character archetypes for generating diverse crossing behaviors.
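To make the idea of an archetype-driven behavioural state machine concrete, here is a deliberately reduced sketch. This summary does not list the paper's 12 states or name its five archetypes, so the four states, the two archetypes, and their probabilities below are all hypothetical placeholders.

```python
import random
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the paper's 12 states are not listed in this summary.
    WALKING = auto()
    WAITING_AT_KERB = auto()
    CROSSING = auto()
    DONE = auto()


# Hypothetical archetypes: probability of starting to cross on each waiting tick.
ARCHETYPES = {"cautious": 0.1, "hurried": 0.6}


def step(state: State, archetype: str, rng: random.Random) -> State:
    """Advance the toy FSM by one tick."""
    if state is State.WALKING:
        return State.WAITING_AT_KERB
    if state is State.WAITING_AT_KERB:
        # The archetype only changes how long the pedestrian hesitates.
        return State.CROSSING if rng.random() < ARCHETYPES[archetype] else state
    # CROSSING finishes after one tick; DONE is absorbing.
    return State.DONE


def simulate(archetype: str, ticks: int, seed: int = 0) -> list[State]:
    """Return the per-tick state trace, analogous to per-frame behavioural labels."""
    rng = random.Random(seed)
    state, trace = State.WALKING, []
    for _ in range(ticks):
        trace.append(state)
        state = step(state, archetype, rng)
    return trace
```

Calling `simulate("hurried", 30)` returns one state per tick. In this toy version, diversity comes only from per-archetype hesitation; a full implementation would presumably vary many more parameters.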
If this is right
- Configurable crossing rates allow tailored datasets for specific training needs in pedestrian prediction.
- Multi-modal synchronized data streams support advanced sensor fusion models.
- Per-frame annotations enable precise supervised learning for crossing detection.
- Docker containerisation and CLI parameters ensure full reproducibility of generated datasets.
Where Pith is reading between the lines
- The generated datasets may help address data scarcity issues in training models for rare but critical pedestrian events.
- Similar hybrid control methods could be applied to other simulation environments to increase behavioral diversity.
- Validation against real-world pedestrian behavior statistics would strengthen the case for using these synthetic datasets.
Load-bearing premise
The hybrid AI-manual control architecture combined with the 12-state FSM can produce crossing rates and behavioral diversity that are both achievable in simulation and useful for training real-world crossing prediction models.
What would settle it
Training a crossing prediction model on the PedSynth++ dataset and finding it performs worse on real-world test data than a model trained on native CARLA data would falsify the utility of the generated datasets.
Original abstract
We present ARCANE-PedSynth, an open-source CARLA-based software framework for generating synthetic multi-pedestrian datasets with dense behavioural annotations for pedestrian crossing prediction in autonomous driving. The framework overcomes CARLA's native 9% crossing rate through a hybrid AI-manual pedestrian control architecture, enabling configurable target rates up to 75%. A 12-state behavioural finite state machine with five character archetypes produces diverse crossing behaviours. The framework generates synchronised RGB, LiDAR, and DVS data with per-frame crossing labels, behavioural states, and estimated 2D pose keypoints. We demonstrate ARCANE-PedSynth through PedSynth++, an example dataset generated with the framework, comprising 533 multi-pedestrian clips across 12 weather conditions with RGB, LiDAR, and DVS streams. ARCANE-PedSynth is fully reproducible via CLI parameterisation and Docker containerisation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents ARCANE-PedSynth, an open-source CARLA-based framework for generating synthetic multi-pedestrian datasets with dense behavioral annotations for crossing prediction. It claims to overcome CARLA's native 9% crossing rate via a hybrid AI-manual pedestrian control architecture that supports configurable target rates up to 75%, using a 12-state behavioral finite state machine with five character archetypes for diverse behaviors. The framework outputs synchronized RGB, LiDAR, and DVS streams with per-frame crossing labels, behavioral states, and 2D pose keypoints. It demonstrates the approach via the PedSynth++ example dataset (533 clips across 12 weather conditions) and emphasizes full reproducibility through CLI parameterization and Docker containerization.
Significance. If the framework performs as described, it offers a practical contribution to autonomous driving research by enabling generation of annotated synthetic data with elevated and controllable pedestrian crossing rates and behavioral variety, which standard CARLA simulations lack. The open-source release, CLI/Docker reproducibility, and multi-modal output streams are clear strengths that support community use and verification. The work addresses a recognized data scarcity issue for crossing prediction models.
major comments (1)
- [Abstract] Abstract: The central claims that the hybrid AI-manual architecture 'overcomes CARLA's native 9% crossing rate' and enables 'configurable target rates up to 75%' are asserted without any validation metrics, achieved-rate statistics, or comparison to baseline CARLA behavior. This quantitative performance assertion is load-bearing for the framework's stated purpose and requires supporting evidence from the generated dataset or experiments.
minor comments (2)
- [Framework description] The description of the 12-state FSM and five archetypes would benefit from a diagram or explicit state-transition table to clarify how behavioral diversity is achieved.
- [Demonstration / PedSynth++] PedSynth++ dataset statistics (533 clips, 12 weather conditions) are given but lack details on how crossing rates were measured or controlled in the generated clips.
Simulated Author's Rebuttal
We thank the referee for their thorough review and for recognizing the potential contribution of ARCANE-PedSynth to addressing data scarcity in pedestrian crossing prediction. We address the single major comment below.
Point-by-point responses
- Referee: [Abstract] Abstract: The central claims that the hybrid AI-manual architecture 'overcomes CARLA's native 9% crossing rate' and enables 'configurable target rates up to 75%' are asserted without any validation metrics, achieved-rate statistics, or comparison to baseline CARLA behavior. This quantitative performance assertion is load-bearing for the framework's stated purpose and requires supporting evidence from the generated dataset or experiments.
Authors: We agree that the abstract presents these performance claims without supporting quantitative evidence, which is a valid observation. The manuscript describes the hybrid control architecture and demonstrates the framework via the PedSynth++ dataset (533 clips), but does not report measured crossing rates, achieved statistics under different target configurations, or direct comparisons against unmodified CARLA pedestrian behavior. We will revise the manuscript by adding a dedicated subsection (and corresponding table) that reports the actual crossing rates observed in PedSynth++ across the 12 weather conditions and configurable target settings, including baseline CARLA runs for comparison. This will substantiate the claims with evidence from the generated data.
Revision: yes
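The achieved-rate table the authors promise could be computed directly from per-frame crossing labels. The sketch below shows one possible way to do that. It assumes each record is a `(weather, frame_labels)` pair for one pedestrian track, and that a pedestrian counts as crossing if any frame is labelled as crossing. Both the record layout and that definition are assumptions, not the dataset's documented schema.

```python
from collections import defaultdict


def track_crosses(frame_labels) -> bool:
    """Assumed definition: a track is 'crossing' if any frame is labelled so."""
    return any(frame_labels)


def crossing_rate_by_condition(tracks) -> dict:
    """Achieved crossing rate per weather condition.

    tracks: iterable of (weather, per-frame boolean labels) pairs (hypothetical layout).
    """
    totals = defaultdict(int)
    crossing = defaultdict(int)
    for weather, frame_labels in tracks:
        totals[weather] += 1
        crossing[weather] += int(track_crosses(frame_labels))
    return {weather: crossing[weather] / totals[weather] for weather in totals}
```

Running the same function over baseline CARLA output and over clips generated at each target setting would give the side-by-side comparison the referee asks for.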
Circularity Check
No significant circularity: framework description only
The paper presents a CARLA-based software framework and dataset generation tool. No mathematical derivations, equations, fitted parameters, predictions, or uniqueness theorems appear in the abstract or described content. Claims concern configurable architecture (hybrid control, 12-state FSM) and output statistics, which are externally verifiable via the stated open-source CLI/Docker release rather than reducing to self-referential inputs. This matches the default non-circular case for engineering/framework papers.