Synscapes: A Photorealistic Synthetic Dataset for Street Scene Parsing
We introduce Synscapes -- a synthetic dataset for street scene parsing created using photorealistic rendering techniques, and show state-of-the-art results for training and validation as well as new types of analysis. We study the behavior of networks trained on real data when performing inference on synthetic data: a key factor in determining the equivalence of simulation environments. We also compare the behavior of networks trained on synthetic data and evaluated on real-world data. Additionally, by analyzing pre-trained, existing segmentation and detection models, we illustrate how uncorrelated images along with a detailed set of annotations open up new avenues for analysis of computer vision systems, providing fine-grained information about how a model's performance changes according to factors such as distance, occlusion, and relative object orientation.
Forward citations
Cited by 6 Pith papers
- CARD: A Multi-Modal Automotive Dataset for Dense 3D Reconstruction in Challenging Road Topography
  CARD is a new multi-modal driving dataset delivering ~500K dense depth pixels per frame from challenging road topographies using stereo cameras and fused LiDARs over 110 km.
- Multi-Modal Guided Multi-Source Domain Adaptation for Object Detection
  MS-DePro achieves state-of-the-art performance on multi-source domain adaptation benchmarks for object detection by using depth-guided region proposals and multi-modal alignment of learnable text embeddings.
- MULTI: Disentangling Camera Lens, Sensor, View, and Domain for Novel Image Generation
  MULTI uses two-stage textual inversion to disentangle camera lens, sensor, view, and domain factors for novel image generation, supporting dataset extension and ControlNet modifications on the new DF-RICO benchmark.
- Syn4D: A Multiview Synthetic 4D Dataset
  Syn4D is a new multiview synthetic 4D dataset supplying dense ground-truth annotations for dynamic scene reconstruction, tracking, and human pose estimation.
- MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
  MoGe-2 recovers metric-scale 3D point maps with fine details from single images via data refinement and extension of affine-invariant predictions.
- AtteConDA: Attention-Based Conflict Suppression in Multi-Condition Diffusion Models and Synthetic Data Augmentation
  AtteConDA adds attention-based conflict suppression to multi-condition diffusion models so that generated driving-scene images retain richer structural cues from the original annotations.