PARSE improves domain generalization accuracy by factoring recognition into visual primitives and their spatial relational compositions learned end-to-end with differentiable predicates.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
MARCO achieves new state-of-the-art semantic correspondence on SPair-71k, AP-10K and PF-PASCAL by combining coarse-to-fine refinement with self-distillation on DINOv2, delivering larger gains at fine thresholds and on unseen keypoints and categories while using 3x fewer parameters and running 10x更快.
VANGUARD is a staged-training VLM framework that reports 94% ROC-AUC and 84% F1 on UCF-Crime while adding chain-of-thought reasoning and spatial grounding to video anomaly detection.
citing papers explorer
-
Domain Generalization through Spatial Relation Induction over Visual Primitives
PARSE improves domain generalization accuracy by factoring recognition into visual primitives and their spatial relational compositions learned end-to-end with differentiable predicates.
-
MARCO: Navigating the Unseen Space of Semantic Correspondence
MARCO achieves new state-of-the-art semantic correspondence on SPair-71k, AP-10K and PF-PASCAL by combining coarse-to-fine refinement with self-distillation on DINOv2, delivering larger gains at fine thresholds and on unseen keypoints and categories while using 3x fewer parameters and running 10x更快.
-
Reasoning-Guided Grounding: Elevating Video Anomaly Detection through Multimodal Large Language Models
VANGUARD is a staged-training VLM framework that reports 94% ROC-AUC and 84% F1 on UCF-Crime while adding chain-of-thought reasoning and spatial grounding to video anomaly detection.