pith. sign in

How do vision transformers work?

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

citation-role summary

background 3

citation-polarity summary

years

2026 10 2025 1

roles

background 2

polarities

background 2

clear filters

representative citing papers

CineMatte: Background Matting for Virtual Production and Beyond

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

CineMatte uses a cross-attention design on a Siamese DINOv3 ViT plus a pretrained upsampler to produce robust mattes for virtual production, backed by a new non-synthetic 4K VP dataset that supports camera motion.

Does Your ViT Still Need U-Net for Segmentation?

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

EoSeg shows that modern ViT backbones support accurate medical image segmentation without U-Net-style decoders via multi-level query modeling and learnable block fusion, with strong results on seven benchmarks.

Elastic Attention Cores for Scalable Vision Transformers

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

VECA learns effective visual representations using core-periphery attention where patches interact exclusively via a resolution-invariant set of learned core embeddings, achieving linear O(N) complexity while maintaining competitive performance.

citing papers explorer

Showing 1 of 1 citing paper after filters.