Memory-efficient visual autoregressive modeling with scale-aware KV cache compression

Kunjun Li, Zigeng Chen, Cheng-Yen Yang, Jenq-Neng Hwang · 2025 · arXiv 2505.19602

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

OScaR mitigates token norm imbalance via canalized rotation and omni-token scaling to enable near-lossless INT2 KV cache quantization with up to 3x decoding speedup and 5.3x memory reduction.

Visual Implicit Autoregressive Modeling

cs.CV · 2026-05-02 · unverdicted · novelty 6.0

VIAR embeds implicit equilibrium layers in visual autoregressive models to achieve ImageNet FID 2.16 with 38.4% of VAR parameters and controllable inference compute.

citing papers explorer

Showing 2 of 2 citing papers.

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond cs.LG · 2026-05-19 · unverdicted · none · ref 29
OScaR mitigates token norm imbalance via canalized rotation and omni-token scaling to enable near-lossless INT2 KV cache quantization with up to 3x decoding speedup and 5.3x memory reduction.
Visual Implicit Autoregressive Modeling cs.CV · 2026-05-02 · unverdicted · none · ref 29
VIAR embeds implicit equilibrium layers in visual autoregressive models to achieve ImageNet FID 2.16 with 38.4% of VAR parameters and controllable inference compute.
