DRIVE-C: A Controlled Corruption Dataset for Autonomous Driving
Pith reviewed 2026-05-12 02:04 UTC · model grok-4.3
The pith
DRIVE-C supplies pixel-aligned pairs of clean and synthetically corrupted driving videos to benchmark how autonomous perception systems respond to known camera degradations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DRIVE-C pairs ten clean real-world driving clips with six hundred corrupted counterparts, a count consistent with applying each of twelve camera degradation types at five severity levels to every clean clip (10 × 12 × 5 = 600). The pairs are pixel-aligned and come with per-clip metadata and Global Sensor Health Index annotations, so that perception reliability can be evaluated under fully reproducible, controlled corruption conditions.
What carries the argument
The DRIVE-C construction process. It starts with anonymized real driving videos and applies physics-inspired synthetic degradations, producing pixel-matched clean and corrupted versions together with reproducible parameters and sensor health labels.
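The clip counts in the core claim can be checked with a short sketch. It assumes a full factorial design (every clean clip receives every degradation type at every severity), which matches 10 × 12 × 5 = 600 but is not stated explicitly in the source. The clip names, degradation names, and file-naming scheme are hypothetical placeholders.

```python
from itertools import product

# Hypothetical identifiers: the source lists neither clip names nor degradation names.
CLEAN_CLIPS = [f"clip_{i:02d}" for i in range(1, 11)]            # 10 clean clips
DEGRADATIONS = [f"degradation_{i:02d}" for i in range(1, 13)]    # 12 degradation types
SEVERITIES = range(1, 6)                                         # 5 severity levels


def corrupted_manifest():
    """Return one record per (clean clip, degradation, severity) combination."""
    return [
        {
            "clean": clip,
            "degradation": deg,
            "severity": sev,
            # Assumed naming scheme, for illustration only.
            "corrupted": f"{clip}__{deg}__s{sev}",
        }
        for clip, deg, sev in product(CLEAN_CLIPS, DEGRADATIONS, SEVERITIES)
    ]


manifest = corrupted_manifest()
print(len(manifest))  # 600
```

Each record keeps a pointer back to its clean source, which is what makes paired (clean versus corrupted) evaluation possible.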
If this is right
- Perception models can be benchmarked on identical inputs with and without each specific degradation to isolate failure modes.
- Systems can be trained or tuned to remain accurate when particular camera degradations appear at known severity levels.
- Out-of-distribution detection methods gain a controlled reference set for labeling inputs as degraded versus nominal.
- Sensor health monitoring algorithms can use the Global Sensor Health Index annotations to validate their predictions against ground-truth corruption parameters.
- Reproducible test clips enable consistent comparison of different autonomous driving pipelines under the same degradation conditions.
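The first and last points above can be sketched as a paired comparison: score a model on each clean clip and on its corrupted counterparts, then average the drop per degradation type and severity. This is a minimal sketch, not the paper's evaluation protocol. The function name, the degradation label `blur`, and all scores are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def paired_drop(clean_scores, corrupted_scores):
    """Average (clean - corrupted) score per (degradation, severity).

    clean_scores: {clip_id: score on the clean clip}
    corrupted_scores: iterable of (clip_id, degradation, severity, score)
    """
    drops = defaultdict(list)
    for clip_id, degradation, severity, score in corrupted_scores:
        # The pairing is by clip_id: same footage, with and without degradation.
        drops[(degradation, severity)].append(clean_scores[clip_id] - score)
    return {key: mean(values) for key, values in drops.items()}


# Made-up scores purely for illustration; these are not results from the paper.
clean = {"clip_01": 0.80, "clip_02": 0.70}
corrupted = [
    ("clip_01", "blur", 1, 0.75),
    ("clip_02", "blur", 1, 0.65),
    ("clip_01", "blur", 5, 0.40),
    ("clip_02", "blur", 5, 0.30),
]
result = paired_drop(clean, corrupted)
for key in sorted(result):
    print(key, round(result[key], 3))
```

Because every corrupted clip shares its footage with a clean clip, the difference isolates the effect of the degradation rather than differences in scene content.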
Where Pith is reading between the lines
- If the synthetic degradations prove representative, the dataset could serve as a standard reference for regulatory or industry robustness certification of ADAS components.
- The paired structure makes it straightforward to measure the exact performance cost of each degradation type, which could guide hardware choices such as camera placement or redundancy.
- Extending the same alignment approach to additional sensor modalities like lidar or radar would create multi-modal controlled corruption benchmarks.
Load-bearing premise
The chosen synthetic degradations accurately capture the visual effects and interactions of actual camera failures that occur during autonomous driving.
What would settle it
Collect real-world instances of camera failures in driving footage and measure whether the resulting visual artifacts or drops in perception model accuracy differ substantially from the patterns produced by the twelve synthetic types in DRIVE-C.
Original abstract
DRIVE-C is a controlled corruption dataset designed to evaluate visual perception robustness in autonomous driving systems. It is built from real-world forward-facing driving videos collected across daytime, nighttime, urban, rural, freeway, and parking environments. Clean clips are anonymized via localized face and license plate blurring, then transformed with physics-inspired synthetic degradations. The dataset contains 10 clean clips and 600 corrupted clips spanning 12 camera degradation types across five severity levels, with per-clip metadata and Global Sensor Health Index (GSHI) annotations. DRIVE-C supports robustness benchmarking, degradation-aware modeling, uncertainty estimation, out-of-distribution (OOD) detection, and sensor health monitoring for Advanced Driver Assistance Systems (ADAS). By providing pixel-aligned clean and degraded video clips with fully reproducible corruption parameters, DRIVE-C offers a structured testbed for studying perception reliability under controlled camera degradation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DRIVE-C, a controlled corruption dataset for autonomous driving perception. It consists of 10 clean real-world forward-facing driving video clips collected across daytime/nighttime, urban/rural, freeway, and parking environments. These are anonymized via localized face and license plate blurring, then transformed using 12 physics-inspired synthetic camera degradations at five severity levels to produce 600 corrupted clips. The dataset provides pixel-aligned clean/degraded pairs, per-clip metadata, and Global Sensor Health Index (GSHI) annotations. It is intended to support robustness benchmarking, degradation-aware modeling, uncertainty estimation, OOD detection, and sensor health monitoring for ADAS by enabling reproducible experiments under controlled degradations.
Significance. If the dataset is released with verified pixel alignment and fully specified reproducible corruption parameters, DRIVE-C would provide a useful structured testbed for isolating the effects of specific camera degradations on perception systems. Controlled synthetic corruptions on real driving footage allow systematic evaluation that is hard to achieve with purely real-world data, facilitating research on robustness, uncertainty, and sensor monitoring. The addition of GSHI annotations extends utility toward practical ADAS health assessment. This aligns with needs in safety-critical autonomous driving where understanding degradation impacts is essential.
major comments (2)
- The exact formula or aggregation method for computing the Global Sensor Health Index (GSHI) from corruption parameters and metadata is not specified in the dataset description or annotations section. This is load-bearing for the claim that DRIVE-C supports sensor health monitoring, as users cannot interpret or extend the index without its definition.
- The manuscript states 600 corrupted clips spanning 12 degradation types and five severities but lacks a table or breakdown (e.g., in the dataset statistics section) showing distribution across the six environment types. Without this, it is unclear whether the claimed diversity is achieved in a balanced way that supports the testbed utility for cross-environment robustness studies.
minor comments (3)
- Add example frames or short video snippets illustrating each of the 12 degradation types at low, medium, and high severity to help readers visualize the controlled corruptions.
- Clarify in the pipeline description whether the anonymization blurring is applied before or after degradation synthesis, and confirm that it does not compromise pixel-level alignment between clean and corrupted versions.
- Include a statement on data and code release plans, including exact parameter files, to substantiate the 'fully reproducible corruption parameters' claim.
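The alignment concern in the second minor comment suggests a simple structural check that users could run on any clean/corrupted pair. The sketch below assumes clips are already decoded into lists of frames, each frame a list of pixel rows. It tests only frame count and frame dimensions, which is a necessary condition for pixel alignment, not a sufficient one. All names are hypothetical.

```python
def frame_shape(frame):
    """Return (height, width) of a frame stored as a list of pixel rows."""
    return (len(frame), len(frame[0]) if frame else 0)


def is_structurally_aligned(clean_clip, corrupted_clip):
    """True if both clips have equal frame counts and equal per-frame dimensions."""
    if len(clean_clip) != len(corrupted_clip):
        return False
    return all(
        frame_shape(a) == frame_shape(b)
        for a, b in zip(clean_clip, corrupted_clip)
    )


# Toy clips: 3 frames of 2x4 pixels each (values are arbitrary).
clean_clip = [[[0] * 4 for _ in range(2)] for _ in range(3)]
corrupted_clip = [[[9] * 4 for _ in range(2)] for _ in range(3)]
short_clip = corrupted_clip[:2]          # one frame missing
cropped_clip = [[[9] * 3 for _ in range(2)] for _ in range(3)]  # narrower frames

print(is_structurally_aligned(clean_clip, corrupted_clip))  # True
print(is_structurally_aligned(clean_clip, short_clip))      # False
print(is_structurally_aligned(clean_clip, cropped_clip))    # False
```

A degradation that shifts or warps the image would pass this check while still breaking pixel correspondence, so it complements rather than replaces the authors' own alignment guarantee.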
Simulated Author's Rebuttal
We thank the referee for the positive recommendation of minor revision and for the constructive comments, which help improve the clarity and utility of the DRIVE-C dataset description. We address each major comment below.
- Referee: The exact formula or aggregation method for computing the Global Sensor Health Index (GSHI) from corruption parameters and metadata is not specified in the dataset description or annotations section. This is load-bearing for the claim that DRIVE-C supports sensor health monitoring, as users cannot interpret or extend the index without its definition.
  Authors: We agree that the explicit formula for GSHI must be provided in the manuscript to support the sensor health monitoring use case. The current version describes the presence of GSHI annotations but does not detail the aggregation. In the revised manuscript we will insert a dedicated subsection (under Dataset Annotations) that states the precise formula, including how corruption parameters, severity levels, and per-clip metadata are combined into the index. This will allow users to interpret, reproduce, and extend GSHI.
  Revision: yes
- Referee: The manuscript states 600 corrupted clips spanning 12 degradation types and five severities but lacks a table or breakdown (e.g., in the dataset statistics section) showing distribution across the six environment types. Without this, it is unclear whether the claimed diversity is achieved in a balanced way that supports the testbed utility for cross-environment robustness studies.
  Authors: We acknowledge that an explicit breakdown is needed to substantiate the claimed environmental diversity. The manuscript currently asserts coverage across daytime/nighttime, urban/rural, freeway, and parking but does not tabulate the counts. In the revised version we will add a table in the Dataset Statistics section that reports the number of clips per environment type, further stratified by degradation type and severity level. This will confirm balanced representation and strengthen the argument for cross-environment robustness evaluation.
  Revision: yes
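The breakdown requested here could be produced directly from per-clip metadata. The sketch below is one way to do it, assuming each clip record carries `environment`, `degradation`, and `severity` fields. These field names and the sample records are hypothetical, since the source does not describe the metadata schema.

```python
from collections import Counter


def tabulate(records, *fields):
    """Count records grouped by the given metadata fields."""
    return Counter(tuple(record[field] for field in fields) for record in records)


# Hypothetical metadata records; not taken from the dataset.
records = [
    {"environment": "urban", "degradation": "degradation_01", "severity": 1},
    {"environment": "urban", "degradation": "degradation_01", "severity": 2},
    {"environment": "freeway", "degradation": "degradation_01", "severity": 1},
    {"environment": "parking", "degradation": "degradation_02", "severity": 1},
]

by_environment = tabulate(records, "environment")
by_env_and_type = tabulate(records, "environment", "degradation")

for key, count in sorted(by_environment.items()):
    print(key[0], count)
```

Grouping by one field gives the per-environment totals; adding `degradation` and `severity` to the field list gives the stratified view the authors promise.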
Circularity Check
No significant circularity; dataset construction is self-contained
The paper is a dataset release describing collection of real driving videos, anonymization via blurring, and application of physics-inspired synthetic degradations to produce pixel-aligned clean/degraded pairs with metadata. No equations, fitted parameters, predictions, or derivation chains exist. The central claim (provision of a controlled, reproducible testbed) follows directly from the pipeline description without reduction to inputs by construction or self-citation. No load-bearing steps require external verification beyond the stated reproducibility of the transforms.
Axiom & Free-Parameter Ledger
invented entities (1)
- Global Sensor Health Index (GSHI): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The dataset contains 10 clean clips and 600 corrupted clips spanning 12 camera degradation types across five severity levels"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.