{"paper":{"title":"Conditional Compatibility Learning for Context-Dependent Anomaly Detection","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Global representations that mix subject and context are provably non-identifiable for context-dependent anomalies.","cross_cats":["cs.LG"],"primary_cat":"cs.CV","authors_text":"Didier Stricker, Jason Rambach, Shashank Mishra","submitted_at":"2026-01-30T11:48:20Z","abstract_excerpt":"Anomaly detection usually assumes that abnormality is an intrinsic property of an observation. A defect is a defect, and a rare object is rare, regardless of where it appears. Many real-world anomalies do not work this way. A runner on a track is normal, but the same runner on a highway is not. The subject is unchanged; only the context makes it anomalous. This setting, long recognized as contextual anomaly detection, remains largely underexplored in modern vision-language systems. The difficulty is not merely empirical; it is formal. When anomaly labels depend on the relation between a subjec"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"any detector reasoning from a global representation that conflates subject and context is provably non-identifiable: two different subject-context configurations can map to the same embedding while requiring opposite labels, and no such detector can be correct on both.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the proposed disentangled subject- and context-aware representations in CC-CLIP can be learned from single images without additional supervision or labels that would reintroduce the original identifiability problem.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Conditional compatibility learning reframes anomaly detection as checking subject-context fit 
rather than global deviation, with CC-CLIP delivering state-of-the-art performance on contextual anomalies and competitive results on structural ones.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Global representations that mix subject and context are provably non-identifiable for context-dependent anomalies.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"20c4e5ec9e4a2477f5a748eeeb926bc3675335bebd4e8fd9f267cf4f7c89300c"},"source":{"id":"2601.22868","kind":"arxiv","version":3},"verdict":{"id":"34a56e77-a5c8-472f-8df8-170935ff4f6d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T09:55:17.723289Z","strongest_claim":"any detector reasoning from a global representation that conflates subject and context is provably non-identifiable: two different subject-context configurations can map to the same embedding while requiring opposite labels, and no such detector can be correct on both.","one_line_summary":"Conditional compatibility learning reframes anomaly detection as checking subject-context fit rather than global deviation, with CC-CLIP delivering state-of-the-art performance on contextual anomalies and competitive results on structural ones.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the proposed disentangled subject- and context-aware representations in CC-CLIP can be learned from single images without additional supervision or labels that would reintroduce the original identifiability problem.","pith_extraction_headline":"Global representations that mix subject and context are provably non-identifiable for context-dependent anomalies."},"references":{"count":14,"sample":[{"doi":"10.1016/j.neucom.2020.11.018","year":2021,"title":"URL https://cdn.openai.com/papers/ gpt-4.pdf. Radford, A., Kim, J. 
W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. Learning transferable visua","work_id":"3a786313-75bc-4877-865f-ee9272d0910d","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Section A presents additional model details","work_id":"e08c441a-4860-41e1-ae04-abeff48733db","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Section B describes implementation and training details","work_id":"d6e45f12-ca24-4aad-ad75-9dcd188fe4c5","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Section C reports extended ablation studies and additional quantitative results","work_id":"3690945b-45d6-4502-a3ab-6c056969fc10","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Section D details the construction, annotation protocol, and evaluation splits of the CAAD-3K dataset","work_id":"392b48b8-d3c8-49a8-b63d-5f5fbc3592fb","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":14,"snapshot_sha256":"6b8cc12edff40daf372f2e9b24b6dd928a73c4bb52fc16cb66eeab3cb4995334","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"32c867292e2f1dfa84e8c463a809dea7c24bf876937ae0627602f54b462bcf39"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}