Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin · 2023

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Large Language Model as Token Compressor and Decompressor

cs.CL · 2026-03-26 · unverdicted · novelty 6.0

A pretrained LLM is adapted via LoRA fine-tuning into a content-adaptive compressor that maps long texts to compact variable-length Z-token sequences while preserving reconstruction quality and downstream performance.

Gaze-Regularized Vision-Language-Action Models for Robotic Manipulation

cs.CV · 2026-03-24 · unverdicted · novelty 6.0

Gaze regularization aligns VLA attention with human visual patterns via KL divergence on patch distributions, yielding 4-12% gains on manipulation benchmarks.

Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers

cs.CV · 2025-11-18 · unverdicted · novelty 6.0

Co-Me distills a confidence predictor to selectively merge low-confidence tokens in visual geometric transformers, delivering up to 21.5x speedup on VGGT and 20.4x on Pi3 while preserving spatial coverage and performance.

citing papers explorer

Showing 3 of 3 citing papers.

Large Language Model as Token Compressor and Decompressor · cs.CL · 2026-03-26 · unverdicted · none · ref 36
Gaze-Regularized Vision-Language-Action Models for Robotic Manipulation · cs.CV · 2026-03-24 · unverdicted · none · ref 46
Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers · cs.CV · 2025-11-18 · unverdicted · none · ref 33
