Unifying feature and cost aggregation with transformers for semantic and visual correspondence

Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin · 2024 · arXiv 2403.11120

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TORA: Topological Representation Alignment for 3D Shape Assembly

cs.CV · 2026-04-05 · unverdicted · novelty 7.0

TORA distills topological structure from pretrained 3D encoders into flow-matching backbones via cosine matching and CKA loss, delivering up to 6.9x faster convergence and better accuracy on 3D shape assembly benchmarks with zero inference overhead.

Entropy-Gradient Grounding: Training-Free Evidence Retrieval in Vision-Language Models

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

Entropy-gradient grounding uses model uncertainty to retrieve evidence regions in VLMs, improving performance on detail-critical and compositional tasks across multiple architectures.

C3G: Learning Compact 3D Representations with 2K Gaussians

cs.CV · 2025-12-03 · unverdicted · novelty 6.0

C3G creates compact 3D Gaussian representations with 2K points by guiding placement via learnable tokens that aggregate multi-view features through attention, yielding better efficiency and performance than dense methods.

citing papers explorer

Showing 3 of 3 citing papers after filters.

TORA: Topological Representation Alignment for 3D Shape Assembly · cs.CV · 2026-04-05 · unverdicted · none · ref 15
Entropy-Gradient Grounding: Training-Free Evidence Retrieval in Vision-Language Models · cs.CV · 2026-04-09 · unverdicted · none · ref 9
C3G: Learning Compact 3D Representations with 2K Gaussians · cs.CV · 2025-12-03 · unverdicted · none · ref 21
