Circle-RoPE: Cone-like decoupled rotary positional embedding for large vision-language models. arXiv preprint arXiv:2505.16416

Wang, C · arXiv 2505.16416

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Mitigating Mask Prior Drift and Positional Attention Collapse in Large Diffusion Vision-Language Models

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Mask prior drift and positional attention collapse cause failures in LDVLMs for long generations, fixed by training-free Mask Prior Suppression and Monotonic RoPE Scaling.

MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

MODIX dynamically rescales positional indices in VLMs using intra-modal covariance-based entropy and inter-modal alignment scores to allocate finer granularity to informative content.

citing papers explorer

Mitigating Mask Prior Drift and Positional Attention Collapse in Large Diffusion Vision-Language Models

cs.CV · 2026-05-14 · unverdicted · none · ref 18

Mask prior drift and positional attention collapse cause failures in LDVLMs for long generations, fixed by training-free Mask Prior Suppression and Monotonic RoPE Scaling.

MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models

cs.CV · 2026-04-14 · unverdicted · none · ref 39

MODIX dynamically rescales positional indices in VLMs using intra-modal covariance-based entropy and inter-modal alignment scores to allocate finer granularity to informative content.
