pith:PEBP4TFO
Can VLMs Unlock Semantic Anomaly Detection? A Framework for Structured Reasoning
Structured reasoning framework lets VLMs detect semantic anomalies in driving scenes with 18.5 percent higher recall.
arxiv:2510.18034 v3 · 2025-10-20 · cs.CV · cs.AI · cs.RO
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PEBP4TFO5XRL37FWVRS5A2Q35Y}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Applying SAVANT improves VLMs' absolute recall by approximately 18.5% over prompting baselines. Using the best proprietary model within the framework also enables automatic labeling of around 10,000 images, which are used to fine-tune a 7B open-source model that achieves 90.8% recall and 93.8% accuracy.
The evaluation uses a 'balanced set of real-world driving scenarios', but its selection criteria and its representativeness of long-tail anomalies are not specified. Without that information, the reported recall gains cannot be firmly attributed to the structured reasoning pipeline rather than to dataset construction.
SAVANT boosts VLM recall for semantic anomaly detection in driving images by 18.5% via structured reasoning and enables fine-tuning a 7B open model to 90.8% recall and 93.8% accuracy.
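The claims above quote recall and accuracy and describe the recall gain as absolute. As a minimal sketch of how those quantities relate, the snippet below uses the standard definitions on confusion-matrix counts. All counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of truly anomalous images that were flagged."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all images that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts on a balanced set: 100 anomalous and 100 normal images.
baseline = dict(tp=60, tn=85, fp=15, fn=40)
structured = dict(tp=80, tn=85, fp=15, fn=20)

base_recall = recall(baseline["tp"], baseline["fn"])
new_recall = recall(structured["tp"], structured["fn"])

# An absolute gain is a difference in percentage points;
# a relative gain divides that difference by the baseline.
absolute_gain = new_recall - base_recall
relative_gain = absolute_gain / base_recall

print(f"recall: {base_recall:.3f} -> {new_recall:.3f}")
print(f"absolute gain: {absolute_gain * 100:.1f} points")
print(f"relative gain: {relative_gain * 100:.1f}%")
print(f"accuracy: {accuracy(**structured):.3f}")
```

The distinction matters when reading the headline number: the same change in recall looks larger when stated as a relative gain than as an absolute one.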
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-21T01:05:11.175417Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
7902fe4caeede2bdfcb6ac65d06a1bee3be3190dc6b428d4b303493b606df8d6
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PEBP4TFO5XRL37FWVRS5A2Q35Y \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7902fe4caeede2bdfcb6ac65d06a1bee3be3190dc6b428d4b303493b606df8d6
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "593f05e84dc7ad1f8d210d0208c7e01fd913cc1d269b611aad76b7acca25ea21",
"cross_cats_sorted": [
"cs.AI",
"cs.RO"
],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.CV",
"submitted_at": "2025-10-20T19:14:29Z",
"title_canon_sha256": "aa1b55cc34fcf2b2d0ea87381c2ee6c38355f7435c3639f9b8b6cb07a037c177"
},
"schema_version": "1.0",
"source": {
"id": "2510.18034",
"kind": "arxiv",
"version": 3
}
}
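The verification command earlier in this record fetches the canonical record, re-serializes it with sorted keys and compact separators, and hashes the result with SHA-256. The same steps can be sketched offline in Python against the JSON shown above. The names `RECORD`, `EXPECTED`, `canonical_bytes`, and `canonical_hash` are illustrative and not part of any Pith API, and the sketch assumes the record printed on this page is identical to what the endpoint returns.

```python
import hashlib
import json

# Expected value copied from the "Canonical hash" section of this record.
EXPECTED = "7902fe4caeede2bdfcb6ac65d06a1bee3be3190dc6b428d4b303493b606df8d6"

# Canonical record copied from this page (assumed to match the served record).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "593f05e84dc7ad1f8d210d0208c7e01fd913cc1d269b611aad76b7acca25ea21",
        "cross_cats_sorted": ["cs.AI", "cs.RO"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2025-10-20T19:14:29Z",
        "title_canon_sha256": "aa1b55cc34fcf2b2d0ea87381c2ee6c38355f7435c3639f9b8b6cb07a037c177",
    },
    "schema_version": "1.0",
    "source": {"id": "2510.18034", "kind": "arxiv", "version": 3},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, as the one-liner does."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(RECORD)
    print(digest)
    print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON fields happen to be written. Any change to a value, including whitespace inside a string, produces a different digest.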