LLaVA-NeXT: Stronger LLMs supercharge multimodal capabilities in the wild

Bo Li, Kaichen Zhang, Hao Zhang, Dong Guo, Renrui Zhang, Feng Li, Yuanhan Zhang, Ziwei Liu, Chunyuan Li · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Fre-Res: Frequency-Residual Video Token Compression for Efficient Video MLLMs

cs.CV · 2026-05-10 · unverdicted · novelty 5.0

Fre-Res compresses video tokens by preserving spatial anchors and representing temporal dynamics with low-frequency residual tokens derived from 1D-DCT on inter-frame residuals, plus a Spatial-Guided Absorber to reinject the information.
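
The summary above can be illustrated with a rough sketch of its frequency-residual idea. This is not the Fre-Res implementation: the function names (`dct_matrix`, `compress_video_tokens`, `reconstruct_video_tokens`), the `(T, N, D)` token layout, and the choice of the first frame as the spatial anchor are all assumptions made for illustration, and the Spatial-Guided Absorber is omitted because the summary does not specify it. The sketch takes inter-frame residuals, applies an orthonormal 1D DCT-II along the time axis, and keeps only the lowest-frequency coefficients.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix of shape (n, n)."""
    k = np.arange(n)[:, None]  # frequency index
    t = np.arange(n)[None, :]  # time index
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)  # scale the DC row so that mat @ mat.T == I
    return mat


def compress_video_tokens(tokens: np.ndarray, keep: int):
    """Compress tokens of shape (T, N, D): T frames, N tokens, D channels.

    Assumption: the first frame serves as the spatial anchor.
    Returns the anchor (N, D) and `keep` low-frequency residual tokens.
    """
    anchor = tokens[0]
    residuals = np.diff(tokens, axis=0)  # inter-frame residuals, (T-1, N, D)
    basis = dct_matrix(residuals.shape[0])
    # 1D DCT along the temporal axis, independently per token and channel.
    coeffs = np.tensordot(basis, residuals, axes=(1, 0))
    return anchor, coeffs[:keep]


def reconstruct_video_tokens(anchor: np.ndarray, low: np.ndarray, num_frames: int):
    """Approximate the original tokens from the anchor and kept coefficients."""
    n = num_frames - 1
    padded = np.zeros((n,) + low.shape[1:])
    padded[: low.shape[0]] = low  # discarded high frequencies stay zero
    # The inverse of an orthonormal DCT-II is its transpose.
    residuals = np.tensordot(dct_matrix(n).T, padded, axes=(1, 0))
    later = anchor[None] + np.cumsum(residuals, axis=0)
    return np.concatenate([anchor[None], later], axis=0)
```

With `keep` equal to `T - 1` the round trip is exact up to floating-point error; smaller values trade temporal detail for fewer tokens.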

citing papers explorer

Showing 1 of 1 citing paper.

Fre-Res: Frequency-Residual Video Token Compression for Efficient Video MLLMs
cs.CV · 2026-05-10 · unverdicted · none · ref 11
