arxiv: 2605.10780 · 2 revisions
Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization