Real-World On-Vehicle Evaluation of Embedding-Based Anomaly Detection

Ahmed Abouelazm; Albert Schotschneider; Daniel Bogdoll; Johann Marius Zoellner; Svetlana Pavlitska

arxiv: 2605.19744 · v1 · pith:L57VLWCUnew · submitted 2026-05-19 · 💻 cs.CV

Pith reviewed 2026-05-20 05:18 UTC · model grok-4.3

classification 💻 cs.CV

keywords anomaly detection · autonomous driving · vision transformer · nearest neighbor · embedding space · real-world evaluation · road anomalies

The pith

A pretrained vision transformer embedding with nearest-neighbor matching to one reference image can detect and localize anomalies in real-world driving scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors propose using embeddings from a pretrained vision transformer to detect anomalies by measuring how different each patch is from the closest patch in a single reference image of a normal scene. This approach requires no training on anomalous examples or dataset-specific fine-tuning, and it runs in real time to produce masks showing where unusual objects appear. They test it on the Road Anomaly benchmark, where it performs well, and then deploy it on an automated vehicle, where it consistently highlights semantically unusual items such as unexpected obstacles in varied traffic conditions. A sympathetic reader would care because collecting representative anomaly data is hard for safety-critical systems like self-driving cars, so a method that works from normality alone could simplify deployment.

Core claim

The central claim is that simple nearest-neighbor similarity in the feature space of a pretrained vision transformer, using patch-wise processing and only a single reference image to define normality, produces effective dense anomaly masks for traffic scenes. This holds both on standard benchmarks and in real on-vehicle evaluations where it highlights semantically unusual objects without supervision or retraining.

What carries the argument

Patch-wise nearest-neighbor similarity in pretrained vision transformer embeddings to model normality from a single reference image and generate dense anomaly localization masks.
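
The scoring step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes cosine distance as the similarity measure, uses random vectors as a stand-in for ViT patch embeddings, and the function names `l2_normalize` and `nn_anomaly_map` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def nn_anomaly_map(query_patches, reference_patches, grid_hw):
    """Score each query patch by cosine distance to its nearest reference patch.

    query_patches:     (N, D) patch embeddings of the incoming frame
    reference_patches: (M, D) patch embeddings of the single reference image
    grid_hw:           (H, W) patch grid with H * W == N
    """
    q = l2_normalize(np.asarray(query_patches, dtype=np.float64))
    r = l2_normalize(np.asarray(reference_patches, dtype=np.float64))
    similarity = q @ r.T                              # (N, M) cosine similarities
    scores = 1.0 - similarity.max(axis=1)             # distance to nearest neighbor
    return np.clip(scores, 0.0, None).reshape(grid_hw)

# Synthetic stand-in for ViT patch embeddings (sizes are assumptions).
rng = np.random.default_rng(0)
reference = rng.normal(size=(64, 32))                 # 8x8 patch grid, 32-dim features
query = reference.copy()                              # frame identical to the reference
query[10] = rng.normal(size=32)                       # one patch with unseen content

anomaly_map = nn_anomaly_map(query, reference, grid_hw=(8, 8))
print(np.unravel_index(anomaly_map.argmax(), anomaly_map.shape))
```

Only the replaced patch receives a clearly nonzero score, because every other query patch has an exact match in the reference. In a real pipeline the coarse patch-level map would still need to be upsampled to image resolution.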

If this is right

The method can adapt to diverse real-world scenarios without collecting new anomalous data or retraining.
It enables real-time operation suitable for on-vehicle use in autonomous driving.
Dense masks allow not just detection but localization of anomalies for potential follow-up actions.
Simple reference-based methods provide useful anomaly signals under realistic conditions.
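
The real-time point in the list above is something a reader could check with a small latency harness. The sketch below is an assumption-laden illustration: `measure_latency` and `summarize` are hypothetical helpers, the scored "frames" are random arrays, and no timing figures from the paper are reproduced here.

```python
import time
import numpy as np

def measure_latency(fn, frames, warmup=2):
    """Return per-frame latencies in milliseconds for fn applied to each frame."""
    for frame in frames[:warmup]:                     # warm-up calls are not timed
        fn(frame)
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        fn(frame)                                     # GPU code would need to synchronize here
        latencies.append((time.perf_counter() - start) * 1e3)
    return np.asarray(latencies)

def summarize(latencies_ms):
    """Mean latency, 95th-percentile latency, and the implied frame rate."""
    mean_ms = max(float(latencies_ms.mean()), 1e-9)
    return {
        "mean_ms": mean_ms,
        "p95_ms": float(np.percentile(latencies_ms, 95)),
        "fps": 1000.0 / mean_ms,
    }

# Stand-in workload: nearest-neighbor scoring on random embeddings (assumed sizes).
rng = np.random.default_rng(0)
reference = rng.normal(size=(1024, 384))
frames = [rng.normal(size=(1024, 384)) for _ in range(5)]

def score(frame):
    return 1.0 - (frame @ reference.T).max(axis=1)

latencies = measure_latency(score, frames)
stats = summarize(latencies)
print(stats)
```

Reporting a tail percentile alongside the mean matters for on-vehicle use, since occasional slow frames are what break a fixed control-loop budget.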

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the single reference works across scenes, it may reduce the data requirements for anomaly detection in other perception tasks.
Combining this with multi-reference or dynamic reference updating could improve robustness to changing conditions like weather.
Success here suggests foundation models embed enough semantic structure to separate normal from unusual without explicit labels.
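
The multi-reference extension suggested above can be sketched as a rolling bank of reference patches. This is an editorial illustration rather than anything the paper reports: `ReferenceBank` is a hypothetical helper, cosine distance is assumed, and the "sunny" and "rainy" references are synthetic vectors that only mimic an appearance shift.

```python
from collections import deque
import numpy as np

def unit_rows(x, eps=1e-12):
    """Normalize each row to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def nn_scores(query_patches, bank_patches):
    """Cosine distance from each query patch to its nearest patch in the bank."""
    return 1.0 - (unit_rows(query_patches) @ unit_rows(bank_patches).T).max(axis=1)

class ReferenceBank:
    """Rolling pool of reference patch embeddings (hypothetical helper)."""

    def __init__(self, max_references=4):
        # A bounded deque drops the oldest reference when a new one arrives.
        self._references = deque(maxlen=max_references)

    def add(self, patches):
        self._references.append(np.asarray(patches, dtype=np.float64))

    def patches(self):
        return np.concatenate(list(self._references), axis=0)

# Synthetic stand-ins: the same scene under two appearance conditions.
rng = np.random.default_rng(1)
sunny = rng.normal(size=(64, 32))
rainy = sunny + 0.5 * rng.normal(size=(64, 32))
frame = rainy + 0.05 * rng.normal(size=(64, 32))      # normal frame captured in rain

bank = ReferenceBank(max_references=2)
bank.add(sunny)
single_ref_scores = nn_scores(frame, bank.patches())
bank.add(rainy)
multi_ref_scores = nn_scores(frame, bank.patches())
print(single_ref_scores.mean(), multi_ref_scores.mean())
```

Adding references can only lower or preserve each patch score, since the nearest neighbor is taken over a superset. That reduces false alarms on normal variation, but it also means an anomaly accidentally present in any reference would be silently accepted.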

Load-bearing premise

That nearest-neighbor similarity to patches from just one reference image in embedding space is enough to represent normality and catch meaningful anomalies in many different traffic situations.

What would settle it

Running the method on a large set of real driving scenes containing known anomalies such as animals or construction debris on the road and checking whether the anomaly masks reliably highlight those objects while avoiding false alarms on normal variations.

Figures

Figures reproduced from arXiv: 2605.19744 by Ahmed Abouelazm, Albert Schotschneider, Daniel Bogdoll, Johann Marius Zoellner, Svetlana Pavlitska.

**Figure 2.** Qualitative evaluation on anomaly detection bench [PITH_FULL_IMAGE:figures/full_fig_p003_2.png]

**Figure 3.** Qualitative real-world evaluation. From left to right: input image, PCA embeddings, anomaly map, binary anomaly map. [PITH_FULL_IMAGE:figures/full_fig_p004_3.png]
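
The two derived panels named in the Figure 3 caption, the PCA view of the embeddings and the binary anomaly map, could be produced roughly as follows. This is a sketch under assumptions: the paper's exact visualization and thresholding rule are not given here, so `pca_rgb`, `binarize`, and the threshold value are hypothetical.

```python
import numpy as np

def pca_rgb(patches, grid_hw):
    """Project (N, D) patch embeddings onto their top three principal
    components and rescale each component to [0, 1] for display as RGB."""
    x = patches - patches.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(x, full_matrices=False)  # rows of vt are principal axes
    projected = x @ vt[:3].T                          # (N, 3)
    lo, hi = projected.min(axis=0), projected.max(axis=0)
    rgb = (projected - lo) / np.maximum(hi - lo, 1e-12)
    return rgb.reshape(*grid_hw, 3)

def binarize(anomaly_map, threshold):
    """Turn a continuous anomaly map into a boolean mask."""
    return anomaly_map > threshold

# Synthetic stand-ins for patch embeddings and a continuous anomaly map.
rng = np.random.default_rng(0)
patches = rng.normal(size=(64, 32))
rgb = pca_rgb(patches, grid_hw=(8, 8))

anomaly_map = np.zeros((8, 8))
anomaly_map[2:4, 5:7] = 0.6                           # a small high-score region
mask = binarize(anomaly_map, threshold=0.5)           # threshold is an assumed value
print(rgb.shape, int(mask.sum()))
```

The threshold is the one free choice in this step, and how it is set (fixed, per-scene, or percentile-based) directly controls the false-positive behavior the referee report asks about.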

Original abstract

Detecting anomalies in traffic scenes is crucial for ensuring safety in autonomous driving, yet collecting representative anomalous data remains challenging. Existing anomaly detection methods are highly specialized and rely on normality as defined by the abstract semantic Cityscapes classes, making it difficult to adapt to diverse real-world scenarios. We propose an adaptable real-time anomaly detection method that leverages foundation models in the form of pretrained vision transformer embeddings to detect deviations via nearest-neighbor similarity in the latent semantic feature space. Based on patch-wise processing, the algorithm produces dense anomaly masks, allowing for the localization of detected anomalies. The method robustly models normality through a single reference image. This formulation avoids explicit supervision and dataset-specific training, making it suitable for real-world deployment. We evaluate the method on standard benchmarks and on an automated vehicle in real-world scenarios. Despite its simplicity, the method achieves good performance on the Road Anomaly benchmark and demonstrates consistent qualitative behavior in practice, successfully highlighting semantically unusual objects in diverse scenes. These results suggest that simple, reference-based methods can provide useful anomaly signals under realistic operating conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper shows a simple single-reference nearest-neighbor method on pretrained ViT embeddings can flag anomalies in driving scenes without training, with some on-vehicle qualitative results, but the robustness under real variations looks under-tested.

The main thing here is that the authors take a basic nearest-neighbor search in pretrained vision transformer patch embeddings and test it for anomaly detection using only one reference image. They run it on a real automated vehicle and say it produces usable dense masks that pick out unusual objects in traffic scenes without any supervision or dataset-specific training. That on-vehicle angle is the clearest addition beyond existing embedding-based anomaly work in computer vision. The method stays deliberately simple, which fits the goal of adaptability when anomalous data is scarce. They report it reaches good performance on the Road Anomaly benchmark and shows consistent qualitative behavior across diverse real scenes. Those parts read as straightforward and practically motivated. The evaluation includes both standard benchmarks and actual vehicle deployment, which gives the work a concrete deployment flavor that many purely benchmark papers lack.

The soft spot is the central modeling assumption. A single reference image is asked to capture normality across lighting changes, weather, viewpoints, and vehicle types that occur in real driving. If the embedding distances are more sensitive to low-level appearance shifts than to semantic anomalies, false positives on normal variations and misses on subtle ones become likely. The abstract does not spell out reference selection, adaptation steps, or explicit controls for these factors, so the stress-test concern about scene variations holds weight unless the full results section shows otherwise. The quantitative claims also need the actual numbers, baselines, and error breakdowns to judge whether the performance edge is meaningful or just adequate.

This paper is for researchers and engineers focused on lightweight, low-data anomaly signals for autonomous driving safety. A reader who wants deployable methods without heavy retraining pipelines would find the real-world tests useful. It deserves a serious referee to examine the metrics, reference handling, and whether the robustness claims survive closer scrutiny on the data.

Referee Report

3 major / 2 minor

Summary. The paper proposes a simple, reference-based anomaly detection method for traffic scenes that uses pretrained vision transformer patch embeddings and nearest-neighbor similarity to a single reference image to generate dense anomaly masks. The approach requires no supervision or dataset-specific training and is evaluated on the Road Anomaly benchmark as well as in real-world on-vehicle tests, where it is claimed to achieve good performance and consistent qualitative behavior in highlighting semantically unusual objects.

Significance. If the quantitative results and robustness claims hold under scrutiny, the work would demonstrate that lightweight, foundation-model-based nearest-neighbor methods can provide useful anomaly signals in diverse real-world driving conditions without retraining, offering a practical alternative to class-specific supervised approaches.

major comments (3)

[§4] (Evaluation on Road Anomaly benchmark): the abstract and results section assert 'good performance' yet provide no numerical metrics (e.g., AUROC, FPR@95%TPR), no comparison to baselines, and no error analysis or definition of how anomalies were labeled; this prevents verification of the central empirical claim.
[§3.1–3.2] (Method and single-reference modeling): the assumption that nearest-neighbor distance in ViT embedding space to one fixed reference image suffices to separate semantic anomalies from normal scene variations (lighting, weather, viewpoint) is load-bearing for the 'adaptable' and 'real-world deployment' assertions, but no sensitivity analysis, reference-selection protocol, or controls for appearance shifts are reported.
[§5] (Real-world on-vehicle evaluation): the claim of 'consistent qualitative behavior' and successful highlighting of unusual objects rests on visual examples alone; without quantitative metrics, false-positive rates under varying conditions, or explicit anomaly definitions, the link from data to the deployment conclusion cannot be assessed.
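
The two metrics named in the first major comment, AUROC and FPR@95%TPR, can be computed from per-pixel anomaly scores and binary ground-truth labels as sketched below. This is a plain NumPy illustration with toy inputs, assuming both classes are present; the function names are hypothetical and no results from the paper are reproduced.

```python
import numpy as np

def roc_curve(scores, labels):
    """False- and true-positive rates over all thresholds, ties grouped together."""
    scores = np.asarray(scores, dtype=np.float64).ravel()
    labels = np.asarray(labels).astype(np.int64).ravel()
    order = np.argsort(-scores, kind="stable")        # highest score first
    s, y = scores[order], labels[order]
    # Last index of each run of equal scores, so tied scores share one threshold.
    cut = np.r_[np.nonzero(np.diff(s))[0], y.size - 1]
    tps = np.cumsum(y)[cut]
    fps = (cut + 1) - tps
    tpr = np.r_[0.0, tps / tps[-1]]
    fpr = np.r_[0.0, fps / fps[-1]]
    return fpr, tpr

def auroc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    fpr, tpr = roc_curve(scores, labels)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """Smallest false-positive rate at which the true-positive rate reaches target_tpr."""
    fpr, tpr = roc_curve(scores, labels)
    return float(fpr[np.searchsorted(tpr, target_tpr, side="left")])

# Toy example: 1 marks an anomalous pixel, 0 a normal one.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auroc(toy_scores, toy_labels), fpr_at_tpr(toy_scores, toy_labels))
```

For full-resolution benchmark images the same functions apply after flattening all pixels of all images into one score vector and one label vector, with any ignore regions masked out beforehand.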

minor comments (2)

[Abstract and §4] The abstract states the method is 'real-time' but no latency or frame-rate numbers are supplied in the experimental section.
[§3.2] Notation for the anomaly score (nearest-neighbor distance) should be defined explicitly with an equation rather than described only in prose.
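
One way the explicit definition requested in the last minor comment might look is given below. The notation is an assumption made for illustration, not taken from the paper: $q_i$ denotes the embedding of the $i$-th patch of the incoming frame, $r_j$ the embedding of the $j$-th patch of the reference image, cosine distance is assumed as the metric, and $\tau$ is a hypothetical threshold for the binary mask.

```latex
% Assumed notation (not from the paper):
%   q_i : embedding of query patch i,      i = 1, ..., N
%   r_j : embedding of reference patch j,  j = 1, ..., M
%   tau : hypothetical binarization threshold
\[
  s_i \;=\; 1 - \max_{j \in \{1,\dots,M\}}
  \frac{q_i^{\top} r_j}{\lVert q_i \rVert_2 \, \lVert r_j \rVert_2},
  \qquad i = 1,\dots,N,
\]
\[
  m_i \;=\; \mathbb{1}\!\left[\, s_i > \tau \,\right].
\]
```

Under this reading, $s_i$ lies in $[0, 2]$ and is zero exactly when some reference patch points in the same direction as $q_i$; a different distance (for example Euclidean) would change the scale but not the nearest-neighbor structure of the score.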

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their thorough review and valuable feedback on our manuscript. We have carefully considered each comment and provide point-by-point responses below. Where appropriate, we will revise the manuscript to address the concerns raised.

Referee: [§4] (Evaluation on Road Anomaly benchmark): the abstract and results section assert 'good performance' yet provide no numerical metrics (e.g., AUROC, FPR@95%TPR), no comparison to baselines, and no error analysis or definition of how anomalies were labeled; this prevents verification of the central empirical claim.

Authors: We agree with the referee that the evaluation on the Road Anomaly benchmark would benefit from explicit numerical metrics and comparisons. Although the manuscript emphasizes the method's performance through qualitative results and its applicability to real-world scenarios, we will incorporate AUROC, FPR@95%TPR, baseline comparisons, error analysis, and a clear definition of anomaly labeling in the revised §4 to allow for better verification of the empirical claims.
revision: yes

Referee: [§3.1–3.2] (Method and single-reference modeling): the assumption that nearest-neighbor distance in ViT embedding space to one fixed reference image suffices to separate semantic anomalies from normal scene variations (lighting, weather, viewpoint) is load-bearing for the 'adaptable' and 'real-world deployment' assertions, but no sensitivity analysis, reference-selection protocol, or controls for appearance shifts are reported.

Authors: The single-reference modeling is a key feature enabling adaptability without retraining. To strengthen this, we will add a sensitivity analysis to the choice of reference image, including variations in lighting, weather, and viewpoint, as well as a protocol for reference selection. Additional experiments will be included to demonstrate robustness to these appearance shifts.
revision: yes

Referee: [§5] (Real-world on-vehicle evaluation): the claim of 'consistent qualitative behavior' and successful highlighting of unusual objects rests on visual examples alone; without quantitative metrics, false-positive rates under varying conditions, or explicit anomaly definitions, the link from data to the deployment conclusion cannot be assessed.

Authors: We recognize that the real-world evaluation is qualitative in nature. In the revised manuscript, we will provide more explicit definitions of what constitutes an anomaly in the deployment context and expand on the test conditions and varying scenarios encountered. However, quantitative metrics such as false-positive rates are challenging to obtain without ground-truth labels, which were not collected during the on-vehicle tests.
revision: partial

standing simulated objections not resolved

Quantitative false-positive rates and other metrics for the real-world on-vehicle evaluation cannot be obtained because the deployment data lack ground-truth anomaly annotations.

Circularity Check

0 steps flagged

Empirical reference-based method exhibits no circularity

The paper presents a straightforward algorithmic procedure that computes anomaly scores from nearest-neighbor distances in pretrained ViT patch embeddings relative to a single reference image. No equations, fitted parameters, or derivations are introduced that reduce the reported outputs to the method definition itself. Performance assertions rest on external benchmark evaluation and on-vehicle testing rather than any self-referential construction or self-citation chain. The approach is therefore self-contained against external data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that off-the-shelf vision transformer embeddings already encode semantic distinctions useful for anomaly detection; no new free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

Domain assumption: Pretrained vision transformer embeddings capture semantic information sufficient to distinguish normal from anomalous traffic-scene content via nearest-neighbor distance.
Invoked when the method models normality from a single reference image without further training or supervision.

pith-pipeline@v0.9.0 · 5732 in / 1388 out tokens · 43079 ms · 2026-05-20T05:18:49.034141+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

We propose a minimal, training-free anomaly detection method that models normality from one reference image using pretrained DINOv3 embeddings, where patch-level features from incoming frames are compared via nearest neighbor (NN) similarity

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · 2 internal anchors

[1] Hermann Blum, Paul-Edouard Sarlin, Juan Nieto, Roland Siegwart, and Cesar Cadena. Fishyscapes: A benchmark for safe semantic segmentation in autonomous driving. In International Conference on Computer Vision (ICCV) - Workshops, 2019.

[2] Daniel Bogdoll, Maximilian Nitsche, and J. Marius Zöllner. Anomaly detection in autonomous driving: A survey. In Conference on Computer Vision and Pattern Recognition (CVPR) - Workshops, pages 4488–4499, 2022.

[3] Daniel Bogdoll, Svenja Uhlemeyer, Kamil Kowol, and J. Marius Zöllner. Perception datasets for anomaly detection in autonomous driving: A survey. In 2023 IEEE Intelligent Vehicles Symposium (IV), pages 1–8. IEEE, 2023.

[4] Daniel Bogdoll, Iramm Hamdard, Lukas N. Rößler, Felix Geisler, Muhammed Bayram, Felix Wang, Jan Imhof, Miguel de Campos, Anushervon Tabarov, Yitian Yang, Martin Gontscharow, Hanno Gottschalk, and J. Marius Zöllner. AnoVox: A Benchmark for Multimodal Anomaly Detection in Autonomous Driving. In European Conference on Computer Vision (ECCV) Worksho...

[5] Robin Chan, Krzysztof Lis, Svenja Uhlemeyer, Hermann Blum, Sina Honari, Roland Siegwart, Pascal Fua, Mathieu Salzmann, and Matthias Rottmann. SegmentMeIfYouCan: A benchmark for anomaly segmentation. arXiv preprint arXiv:2104.14812, 2021.

[6] Simon Damm, Mike Laszkiewicz, Johannes Lederer, and Asja Fischer. AnomalyDINO: Boosting patch-based few-shot anomaly detection with DINOv2. In Winter Conference on Applications of Computer Vision (WACV), pages 1319–1329. IEEE, 2025.

[7] Anja Delic, Matej Grcic, and Sinisa Segvic. Outlier detection by ensembling uncertainty with negative objectness. In British Machine Vision Conference (BMVC), 2024.

[8] Matej Grcic, Petra Bevandic, Zoran Kalafatic, and Sinisa Segvic. Dense out-of-distribution detection by robust learning on synthetic negative data. Sensors, 2024.

[9] Marc Heinrich, Maximilian Zipfl, Marc Uecker, Sven Ochs, Martin Gontscharow, Tobias Fleck, Jens Doll, Philip Schörner, Christian Hubschneider, Marc René Zofka, Alexander Viehl, and J. Marius Zöllner. CoCar NextGen: a Multi-Purpose Platform for Connected Autonomous Driving Research. In International Conference on Intelligent Transportation Systems (IT...

[10] Jiayu Huo, Jingyuan Hong, and Liyun Chen. Dino-ad: Unsupervised anomaly detection with frozen dino-v3 features. arXiv preprint arXiv:2602.03870, 2026.

[11] Chang Won Lee, Selina Leveugle, Paul Grouchy, Chris Langley, Svetlana Stolpner, Jonathan Kelly, and Steven L Waslander. Flowclas: Enhancing normalizing flow-based anomaly segmentation via contrastive learning. In Winter Conference on Applications of Computer Vision (WACV),

[12] Camile Lendering, Erkut Akdag, and Egor Bondarev. SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling. CoRR, abs/2602.23013, 2026.

[13] Krzysztof Lis, Krishna Kanth Nakka, Pascal Fua, and Mathieu Salzmann. Detecting the unexpected via image resynthesis. In International Conference on Computer Vision (ICCV), pages 2152–2161. IEEE, 2019.

[14] Sven Ochs, Jens Doll, Daniel Grimm, Tobias Fleck, Marc Heinrich, Stefan Orf, Albert Schotschneider, Helen Gremmelmaier, Rupert Polley, Svetlana Pavlitska, et al. One stack to rule them all: To drive automated vehicles, and reach for the 4th level. arXiv preprint arXiv:2404.02645, 2024.

[15] Max Peter Ronecker, Matthew Foutter, Amine Elhafsi, Daniele Gammelli, Ihor Barakaiev, Marco Pavone, and Daniel Watzenig. Vision foundation model embedding-based semantic anomaly detection. arXiv preprint arXiv:2505.07998, 2025.

[16] Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, et al. DINOv3. arXiv preprint arXiv:2508.10104, 2025.
