SynthRender and IRIS: Open-Source Framework and Dataset for Bidirectional Sim-Real Transfer in Industrial Object Perception
Pith reviewed 2026-05-21 11:48 UTC · model grok-4.3
The pith
Synthetic data generation with guided domain randomization trains industrial object detectors that reach 95.3% to 99.1% mAP@50 on real imagery without real-world fine-tuning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that an integrated pipeline, combining 2D-to-3D reality-to-simulation asset creation with programmatic Guided Domain Randomization inside SynthRender, produces synthetic images whose statistics are close enough to real industrial camera output that standard detectors trained only on those images generalize to real test sets at the reported mAP@50 levels (95.3% to 99.1%). They further claim that the accompanying IRIS dataset supplies the controlled conditions needed to measure bidirectional sim-real transfer.
What carries the argument
Guided Domain Randomization inside the SynthRender framework, which systematically varies a small set of rendering parameters chosen to align synthetic image statistics with target real-camera conditions.
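To make the idea of varying a small, bounded set of rendering parameters concrete, here is a minimal sketch in Python. The parameter names, ranges, and helper functions are illustrative assumptions and are not taken from the paper or from SynthRender's actual API. The log-uniform draw is one possible reading of the "nonlinear light intensity sampling" mentioned in the paper passages quoted on this page.

```python
import math
import random

# Hypothetical parameter ranges for illustration only;
# NOT taken from the paper or from SynthRender.
GDR_RANGES = {
    "light_intensity_w": (50.0, 2000.0),      # sampled log-uniformly below
    "light_temperature_k": (2700.0, 6500.0),  # sampled uniformly
    "camera_distance_m": (0.4, 1.2),          # sampled uniformly
    "material_roughness": (0.1, 0.9),         # sampled uniformly
}


def sample_log_uniform(rng, low, high):
    """Draw a value so that each order of magnitude is equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_scene_params(rng, ranges=GDR_RANGES):
    """Draw one set of rendering parameters from the bounded ranges."""
    params = {}
    for name, (low, high) in ranges.items():
        if name == "light_intensity_w":
            params[name] = sample_log_uniform(rng, low, high)
        else:
            params[name] = rng.uniform(low, high)
    return params


rng = random.Random(0)  # fixed seed so the draws are repeatable
scenes = [sample_scene_params(rng) for _ in range(5)]
for scene in scenes:
    print({key: round(value, 2) for key, value in scene.items()})
```

In a guided setup, the ranges themselves would presumably be narrowed toward the target camera conditions; how the authors choose them is exactly what the referee report below asks to see documented.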
If this is right
- Ablation results yield concrete guidelines on which rendering choices most improve sim-to-real transfer for textured industrial parts.
- The framework supports both sim-to-real training and real-to-sim evaluation because it supplies matching CAD models and reconstructed meshes.
- High scores across three distinct benchmarks indicate the approach scales to different object categories and imaging conditions common in factories.
- Open release of both the generator and the 19,672-annotation IRIS set enables direct replication and extension by other groups working on proprietary parts.
Where Pith is reading between the lines
- The same parameter-tuning discipline could be applied to other perception tasks such as segmentation or pose estimation in similar constrained environments.
- Automated search over the Guided Domain Randomization parameter space might further reduce the manual effort needed to adapt the method to new camera setups.
- Because the method relies on 3D asset reconstruction from real parts, it naturally supports incremental addition of new proprietary objects without redesigning the entire pipeline.
Load-bearing premise
The specific randomization parameters selected in simulation will make the distribution of synthetic images close enough to real industrial photographs that no additional real data or adaptation step is required for good performance.
What would settle it
A side-by-side comparison in which the mAP@50 of a model trained on SynthRender images drops below 80 percent when evaluated on a new real test set whose lighting, background clutter, or sensor noise visibly differs from the ranges used during Guided Domain Randomization.
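Since the proposed test hinges on mAP@50, a simplified sketch of the per-class quantity behind it may help. The code below computes average precision at an IoU threshold of 0.5 for one class on one image, using greedy matching and all-point interpolation. It is an assumption-laden simplification: real evaluators pool detections across all images, average over classes, and differ in matching details, and nothing here reflects the authors' evaluation code. The boxes in the example are toy values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ap50(detections, ground_truth, iou_thr=0.5):
    """AP for one class; detections are (score, box), ground_truth are boxes."""
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_iou >= iou_thr:
            matched.add(best_j)
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing in recall (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy example: two ground-truth boxes, three detections (one is a false positive).
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 30, 30))]
score = ap50(dets, gts)
print(round(score, 4))
```

Under this definition, the falsification test amounts to recomputing the class-averaged score on a new real test set and checking whether it falls below 0.80.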
Original abstract
Object perception is fundamental for tasks such as robotic material handling and quality inspection. However, modern supervised deep-learning models require large annotated datasets for robust automation under semi-uncontrolled conditions, a major barrier to widespread deployment with proprietary industrial parts. We address this through an integrated framework combining synthetic data generation and structured empirical evaluation for systematic investigation of bidirectional sim-to-real transfer. Our method integrates 2D-to-3D Reality-to-Simulation techniques for 3D asset creation from physical parts with programmatic Guided Domain Randomization (GDR) via SynthRender, an open-source synthetic image generation framework. Structured ablation studies across multiple benchmarks quantify the impact of individual rendering design choices, yielding practical guidelines for data-efficient synthetic training. To support evaluation under realistic industrial conditions, we introduce Industrial Real-Sim Imagery Set (IRIS), a 32-class dataset with diverse textures, intra-class variation, strong inter-class similarities, and 19,672 annotations, providing both CAD models and reconstructed meshes for bidirectional sim-to-real benchmarking. Across three industrial benchmarks, the proposed framework achieves highly competitive performance, reaching 99.1% mAP@50 on a public robotics dataset, 98.3% mAP@50 on an automotive benchmark, and 95.3% mAP@50 on IRIS.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces SynthRender, an open-source framework for synthetic image generation that combines 2D-to-3D Reality-to-Simulation asset creation with programmatic Guided Domain Randomization (GDR). It also releases the IRIS dataset (32 classes, 19,672 annotations, with CAD models and reconstructed meshes). Structured ablation studies across three industrial benchmarks are used to derive practical guidelines, and the framework is reported to achieve 99.1% mAP@50 on a public robotics dataset, 98.3% mAP@50 on an automotive benchmark, and 95.3% mAP@50 on IRIS, all without real-data fine-tuning or domain adaptation.
Significance. If the central claims hold, the work provides a practical, open-source route to data-efficient training for industrial object perception where real annotated data is scarce or proprietary. The bidirectional sim-real design, the new IRIS benchmark, and the ablation-derived guidelines are concrete contributions that could be adopted by robotics and inspection practitioners. The release of both the rendering framework and the dataset with meshes strengthens reproducibility and enables future comparisons.
major comments (2)
- [§3.2] §3.2 (Guided Domain Randomization): The high mAP claims rest on the assumption that GDR produces image statistics sufficiently close to the three real test distributions. The manuscript supplies only high-level descriptions of GDR; it does not report the exact parameter ranges, sampling distributions, or any quantitative domain-gap metrics (FID, MMD, or per-channel histogram distances) between the generated SynthRender images and the corresponding real robotics/automotive/IRIS images. This omission is load-bearing because the reported performance may reflect fortunate alignment on the chosen objects and capture conditions rather than a general, validated transfer method.
- [§4] §4 (Experiments): The structured ablation studies are presented as evidence that individual rendering choices matter, yet the text does not include error bars, exact train/test splits, or the number of random seeds used. Without these, it is difficult to determine whether the reported gains (e.g., from adding GDR) are statistically reliable or sensitive to post-hoc choices of hyperparameters.
minor comments (2)
- [Abstract and §4.1] The abstract and §4.1 do not state the object detector backbone or training protocol (e.g., YOLOv8, Faster R-CNN, input resolution, optimizer). Adding these details would improve reproducibility.
- [Table 1] Table 1 (dataset statistics) lists 19,672 annotations but does not clarify the number of distinct images or the precise train/validation/test partition used for the IRIS benchmark; this information is needed to interpret the 95.3% mAP figure.
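The first major comment asks for quantitative domain-gap metrics such as per-channel histogram distances. As a rough illustration of the simplest of these, the sketch below compares normalized RGB histograms of two pixel pools using total variation distance. The function names, bin count, and toy pixel data are assumptions made for this example; the paper does not report this computation, and FID or MMD would require feature extractors that are out of scope here.

```python
import random


def channel_histogram(pixels, channel, bins=16):
    """Normalized histogram of one 8-bit channel over a pool of RGB pixels."""
    counts = [0] * bins
    for px in pixels:
        idx = min(px[channel] * bins // 256, bins - 1)
        counts[idx] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_distance(pixels_a, pixels_b, bins=16):
    """Total variation distance per RGB channel; each value lies in [0, 1]."""
    distances = []
    for channel in range(3):
        ha = channel_histogram(pixels_a, channel, bins)
        hb = channel_histogram(pixels_b, channel, bins)
        distances.append(0.5 * sum(abs(x - y) for x, y in zip(ha, hb)))
    return distances


def random_pixels(rng, low, high, n=5000):
    """Toy stand-in for pixels pooled from a set of images."""
    return [tuple(rng.randint(low, high) for _ in range(3)) for _ in range(n)]


rng = random.Random(0)
real = random_pixels(rng, 80, 200)         # pretend real-camera pixels
synth_close = random_pixels(rng, 80, 200)  # same intensity range as "real"
synth_far = random_pixels(rng, 0, 120)     # much darker renders

close_dist = histogram_distance(real, synth_close)
far_dist = histogram_distance(real, synth_far)
print("close:", [round(d, 3) for d in close_dist])
print("far:  ", [round(d, 3) for d in far_dist])
```

A distance near 0 means the two pools have similar intensity statistics; values near 1 mean almost no overlap. Reporting such a number for each benchmark would show how closely the synthetic images track the real ones.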
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and indicate the revisions we will make to improve clarity, reproducibility, and statistical rigor.
- Referee: [§3.2] §3.2 (Guided Domain Randomization): The high mAP claims rest on the assumption that GDR produces image statistics sufficiently close to the three real test distributions. The manuscript supplies only high-level descriptions of GDR; it does not report the exact parameter ranges, sampling distributions, or any quantitative domain-gap metrics (FID, MMD, or per-channel histogram distances) between the generated SynthRender images and the corresponding real robotics/automotive/IRIS images. This omission is load-bearing because the reported performance may reflect fortunate alignment on the chosen objects and capture conditions rather than a general, validated transfer method.
  Authors: We agree that the current description of Guided Domain Randomization is insufficiently detailed for full reproducibility and that quantitative domain-gap metrics would strengthen the validation of the transfer method. In the revised manuscript we will expand §3.2 to list the exact parameter ranges and sampling distributions employed for each randomization factor. We will also compute and report Fréchet Inception Distance (FID) scores between the SynthRender-generated images and the real images of each benchmark (robotics, automotive, and IRIS) to provide a direct quantitative measure of distributional similarity.
  Revision: yes
- Referee: [§4] §4 (Experiments): The structured ablation studies are presented as evidence that individual rendering choices matter, yet the text does not include error bars, exact train/test splits, or the number of random seeds used. Without these, it is difficult to determine whether the reported gains (e.g., from adding GDR) are statistically reliable or sensitive to post-hoc choices of hyperparameters.
  Authors: We acknowledge that the absence of error bars and explicit experimental protocol details limits the ability to assess statistical reliability. The ablation experiments were run with five independent random seeds, and train/test splits followed the official protocols of each public benchmark. In the revised manuscript we will add error bars (mean ± standard deviation across seeds) to all ablation tables and figures in §4 and will explicitly document the number of seeds together with the precise train/test split ratios used for every experiment.
  Revision: yes
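The promised "mean ± standard deviation across seeds" reporting can be sketched in a few lines. The configuration names and scores below are placeholders invented for this illustration; they are not results from the paper, and the helper name is hypothetical.

```python
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


# PLACEHOLDER values for five seeds per configuration; not from the paper.
placeholder_runs = {
    "config_a": [0.90, 0.91, 0.89, 0.92, 0.90],
    "config_b": [0.93, 0.95, 0.94, 0.94, 0.93],
}

for name, scores in placeholder_runs.items():
    mean, std = summarize(scores)
    print(f"{name}: {mean:.3f} ± {std:.3f} (n={len(scores)})")
```

With only five seeds the sample standard deviation is itself noisy, so a gap between two configurations is more convincing when it clearly exceeds the spread of both.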
Circularity Check
No significant circularity; empirical benchmarks are self-contained
full rationale
The manuscript presents an open-source synthetic rendering framework (SynthRender) with Guided Domain Randomization and a new 32-class dataset (IRIS) for sim-to-real transfer evaluation. All reported results are direct empirical mAP@50 measurements on three separate industrial benchmarks (99.1% on public robotics data, 98.3% on automotive, 95.3% on IRIS) obtained from ablation studies. These outcomes are measured against held-out real test images and do not reduce to any fitted parameter, self-definition, or self-citation chain. No equations, uniqueness theorems, or ansatz adoptions appear in the provided text that would create a circular derivation; the performance numbers are falsifiable experimental observations rather than constructed predictions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Synthetic images generated via 3D reconstruction and domain randomization can produce training distributions that support high-accuracy models on real industrial imagery.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "SynthRender applies DR or GDR according to user-defined rules and ranges... physics simulation for realistic object placement and three-point lighting"
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean, J_uniquely_calibrated_via_higher_derivative (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "Ablation studies reveal that bounded DR, combining physically plausible scene formation with diverse lighting spectra, nonlinear light intensity sampling, and randomized PBR materials"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.