MARCO: Navigating the Unseen Space of Semantic Correspondence
Pith reviewed 2026-05-10 05:40 UTC · model grok-4.3
The pith
MARCO uses self-distillation on a DINOv2 backbone to turn sparse keypoints into dense semantic correspondences that generalize to unseen points and categories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Building upon DINOv2, MARCO introduces a unified model for generalizable correspondence driven by a novel training framework. By coupling a coarse-to-fine objective that refines spatial precision with a self-distillation framework that expands sparse supervision beyond annotated regions, the approach transforms a handful of keypoints into dense, semantically coherent correspondences while remaining 3x smaller and 10x faster than diffusion-based alternatives.
What carries the argument
The self-distillation framework that uses the model's own predictions to expand sparse keypoint annotations into dense, semantically coherent correspondence maps, combined with a coarse-to-fine refinement objective.
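The densification idea can be illustrated with a minimal sketch: a teacher's patch features nominate a nearest neighbour for every location in the other image, and only confident matches are kept as pseudo-labels. The cosine nearest-neighbour rule and the threshold `tau` below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def densify_pseudo_labels(feats_a, feats_b, tau=0.8):
    """Expand sparse supervision: for every patch in image A, take the
    most similar patch in image B as a pseudo-correspondence, keeping
    only matches whose cosine similarity clears the threshold tau.

    feats_a: (N, D) unit-normalized patch features of image A
    feats_b: (M, D) unit-normalized patch features of image B
    Returns (best-match indices into B, boolean keep mask).
    """
    sim = feats_a @ feats_b.T   # (N, M) cosine similarities
    best = sim.argmax(axis=1)   # nearest neighbour in B
    conf = sim.max(axis=1)      # confidence of each match
    return best, conf >= tau
```

In a full self-distillation loop, the kept pairs would join the annotated keypoints as training targets for the student.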
If this is right
- New state-of-the-art results on SPair-71k, AP-10K, and PF-PASCAL, with the largest gains at fine-grained thresholds (+8.9 PCK@0.01).
- Strongest reported generalization to unseen keypoints, reaching +5.1 PCK on SPair-U.
- Strongest reported generalization to unseen categories, reaching +4.7 PCK on MP-100.
- The model remains 3 times smaller and runs 10 times faster than competing diffusion-based approaches.
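For reference, PCK@alpha counts a predicted keypoint as correct when it lands within alpha times a normalizing size (the maximum bounding-box side, under the SPair-71k convention) of the ground truth. A minimal sketch:

```python
import numpy as np

def pck(pred, gt, bbox_size, alpha=0.1):
    """PCK@alpha: fraction of keypoints whose predicted location lies
    within alpha * bbox_size pixels of the ground truth.

    pred, gt: (K, 2) arrays of (x, y) coordinates
    bbox_size: normalizer, typically max(bbox height, bbox width).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=1)
    return float((dist <= alpha * bbox_size).mean())
```

At alpha = 0.01 a 100-pixel box tolerates only a 1-pixel error, which is why gains concentrated there indicate sharper localization rather than broadly better matching.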
Where Pith is reading between the lines
- Self-distillation may serve as a general technique to densify supervision in other sparse-label vision tasks without collecting additional annotations.
- The results suggest that carefully designed training objectives on DINOv2 features can replace the need for explicit diffusion backbones in correspondence problems.
- The same densification process could be tested on video sequences or 3D point clouds to produce temporally or spatially consistent matches.
- The reduced size and speed make accurate semantic correspondence feasible for real-time applications on edge devices.
Load-bearing premise
The self-distillation step expands sparse keypoint supervision into dense correspondences without introducing systematic errors or confirmation bias from the teacher predictions.
What would settle it
Train the model on SPair-71k and evaluate PCK@0.01 and generalization metrics on a dataset whose keypoints and object categories are completely disjoint from all training data; if the reported gains over prior methods disappear, the central claim does not hold.
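That test hinges on the evaluation split being truly disjoint from training data. A small guard like the following (names are illustrative, not from the paper) would turn any leakage into a hard failure before evaluation runs:

```python
def assert_disjoint(train_categories, eval_categories,
                    train_keypoints, eval_keypoints):
    """Fail loudly if the evaluation split shares any category or
    keypoint label with the training data."""
    cat_overlap = set(train_categories) & set(eval_categories)
    kp_overlap = set(train_keypoints) & set(eval_keypoints)
    if cat_overlap or kp_overlap:
        raise ValueError(
            f"split leakage: categories={sorted(cat_overlap)}, "
            f"keypoints={sorted(kp_overlap)}")
```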
Original abstract
Recent advances in semantic correspondence rely on dual-encoder architectures, combining DINOv2 with diffusion backbones. While accurate, these billion-parameter models generalize poorly beyond training keypoints, revealing a gap between benchmark performance and real-world usability, where queried points rarely match those seen during training. Building upon DINOv2, we introduce MARCO, a unified model for generalizable correspondence driven by a novel training framework that enhances both fine-grained localization and semantic generalization. By coupling a coarse-to-fine objective that refines spatial precision with a self-distillation framework, which expands sparse supervision beyond annotated regions, our approach transforms a handful of keypoints into dense, semantically coherent correspondences. MARCO sets a new state of the art on SPair-71k, AP-10K, and PF-PASCAL, with gains that amplify at fine-grained localization thresholds (+8.9 PCK@0.01), strongest generalization to unseen keypoints (+5.1, SPair-U) and categories (+4.7, MP-100), while remaining 3x smaller and 10x faster than diffusion-based approaches. Code is available at https://github.com/visinf/MARCO .
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MARCO, a DINOv2-based unified model for semantic correspondence. It proposes a coarse-to-fine objective coupled with a self-distillation framework that expands sparse keypoint annotations into dense, semantically coherent pseudo-labels. The work claims new state-of-the-art results on SPair-71k, AP-10K, and PF-PASCAL, with amplified gains at fine-grained thresholds (+8.9 PCK@0.01), improved generalization to unseen keypoints (+5.1 PCK on SPair-U) and categories (+4.7 on MP-100), and efficiency benefits (3x smaller and 10x faster than diffusion-based approaches). Code is released.
Significance. If the central claims hold after validation, the work would represent a meaningful advance in semantic correspondence by addressing poor generalization beyond training keypoints, a key limitation for real-world use. The efficiency gains relative to large diffusion models could increase accessibility, and the explicit code release at https://github.com/visinf/MARCO is a clear strength supporting reproducibility.
Major comments (2)
- The self-distillation framework (described as expanding sparse supervision beyond annotated regions via coupling to the coarse-to-fine objective) is load-bearing for the generalization results (+5.1 PCK on SPair-U unseen keypoints and +4.7 on MP-100 unseen categories). However, the manuscript provides no independent quantitative measurement of pseudo-label fidelity or confirmation bias outside the original sparse set, leaving open the possibility that teacher (DINOv2) errors are propagated rather than true semantic generalization being achieved.
- The experimental results section reports SOTA gains including +8.9 PCK@0.01 without error bars, ablation studies isolating the self-distillation component, or full details on baseline implementations and statistical significance testing. This undermines assessment of whether the fine-grained and out-of-distribution improvements are robust or could be inflated.
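The missing measurement in the first comment can be operationalized directly: hold out a subset of annotated keypoints, let the teacher produce pseudo-labels at those locations, and report their error. A sketch under assumed array shapes (this helper is hypothetical, not from the manuscript):

```python
import numpy as np

def pseudo_label_error(pseudo, gt, keep_mask):
    """Mean pixel error of retained pseudo-labels against held-out
    ground truth. High error among confidently kept labels would
    signal teacher bias being distilled rather than corrected.

    pseudo, gt: (K, 2) coordinates
    keep_mask: (K,) bool, labels that survived the confidence filter
    and were actually used for training.
    """
    pseudo = np.asarray(pseudo, dtype=float)
    gt = np.asarray(gt, dtype=float)
    err = np.linalg.norm(pseudo - gt, axis=1)
    return float(err[np.asarray(keep_mask)].mean())
```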
Minor comments (2)
- The abstract states efficiency claims (3x smaller, 10x faster) but does not specify the exact parameter counts or runtime measurements used for comparison with diffusion-based approaches.
- Ensure that all tables and figures in the full manuscript include clear captions, units, and any necessary statistical details for the PCK metrics.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major comment point by point below and describe the revisions we will incorporate to strengthen the presentation and validation of our results.
Point-by-point responses
- Referee: The self-distillation framework (described as expanding sparse supervision beyond annotated regions via coupling to the coarse-to-fine objective) is load-bearing for the generalization results (+5.1 PCK on SPair-U unseen keypoints and +4.7 on MP-100 unseen categories). However, the manuscript provides no independent quantitative measurement of pseudo-label fidelity or confirmation bias outside the original sparse set, leaving open the possibility that teacher (DINOv2) errors are propagated rather than true semantic generalization being achieved.
Authors: We agree that direct quantitative assessment of pseudo-label fidelity would provide stronger support for the generalization claims. The performance improvements on unseen keypoints and categories are larger than those achieved by the DINOv2 teacher alone, which suggests the self-distillation process contributes to genuine semantic expansion rather than simple error propagation. Nevertheless, to address this concern explicitly, we will add a new analysis subsection that measures pseudo-label quality against available ground-truth annotations on held-out regions and includes controlled ablations to quantify any confirmation bias. revision: yes
- Referee: The experimental results section reports SOTA gains including +8.9 PCK@0.01 without error bars, ablation studies isolating the self-distillation component, or full details on baseline implementations and statistical significance testing. This undermines assessment of whether the fine-grained and out-of-distribution improvements are robust or could be inflated.
Authors: We acknowledge that reporting error bars, isolating the self-distillation contribution, providing fuller baseline implementation details, and including statistical significance tests would improve the robustness assessment. The current manuscript contains ablations on the overall framework and coarse-to-fine objective, but we will expand the experimental section in revision to report standard deviations over multiple random seeds, add a dedicated ablation table isolating self-distillation, include precise reproduction details for all baselines, and apply appropriate statistical tests (e.g., paired t-tests) to the reported gains at fine thresholds and on out-of-distribution splits. revision: yes
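The significance test the authors promise can be as simple as a paired t statistic over matched per-seed PCK scores (`scipy.stats.ttest_rel` also provides the p-value; the manual form below shows the computation):

```python
import numpy as np

def paired_t(scores_a, scores_b):
    """Paired t statistic over matched per-seed scores of two methods;
    positive values favour method A. Compare against the t distribution
    with len(scores_a) - 1 degrees of freedom for a p-value."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(len(d))))
```

Because both methods are evaluated on the same seeds, the paired form removes seed-to-seed variance that an unpaired test would count against the reported gains.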
Circularity Check
No circularity: empirical training framework with independent benchmark validation
Full rationale
The paper presents MARCO as an empirical method that augments DINOv2 via a self-distillation framework and coarse-to-fine objective to expand sparse keypoint supervision into dense correspondences. No equations, derivations, or self-citations are shown that reduce the reported PCK gains, generalization metrics, or efficiency claims to quantities defined by the inputs or by construction. The core claims rest on released code and external benchmark results rather than tautological redefinitions or fitted-parameter predictions. This is a standard self-contained empirical contribution.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: DINOv2 provides a strong semantic feature backbone for correspondence tasks.