Peripheral vision transformer. Advances in Neural Information Processing Systems, 35:32097–32111, 2022.
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
PFGNet introduces a frequency-guided peripheral gating block in a pure convolutional architecture to enable adaptive receptive fields for efficient spatiotemporal prediction with fewer parameters than prior methods.
citing papers explorer
- LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models
  LLMind uses bio-inspired non-uniform sampling via a Mobius module and closed-loop semantic feedback to retain 82-97% of full-resolution VLM performance with only 1-5% of pixels on VQA benchmarks.
- PFGNet: A Fully Convolutional Frequency-Guided Peripheral Gating Network for Efficient Spatiotemporal Predictive Learning
  PFGNet introduces a frequency-guided peripheral gating block in a pure convolutional architecture to enable adaptive receptive fields for efficient spatiotemporal prediction with fewer parameters than prior methods.