A framework mines spatial, functional, and qualitative commonsense constraints from SGG training data and uses them to correct ranked predictions at inference, yielding consistent gains on three benchmarks.
Symbolic rule extraction from attention-guided sparse representations in vision transformers. Theory and Practice of Logic Programming, 25(4):722–738, 2025.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Visual Commonsense Driven Knowledge Refinements for Scene Graph Generation