SparseMM: Head sparsity emerges from visual concept responses in MLLMs

Jiahui Wang, Zuyan Liu, Yongming Rao, Jiwen Lu · 2025 · arXiv 2506.05344

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AIA: Rethinking Architecture Decoupling Strategy In Unified Multimodal Model

cs.CV · 2025-11-27 · unverdicted · novelty 7.0

AIA loss teaches unified multimodal models task-specific cross-modal attention patterns to reduce conflicts between image understanding and generation without architecture decoupling.

Vision-Core Guided Contrastive Learning for Balanced Multi-modal Prognosis Prediction of Stroke

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

A tri-modal model with LLM-generated text from MRIs and a vision-guided dual alignment fusion module achieves state-of-the-art performance on real-world ischemic stroke prognosis prediction.

HybridKV: Hybrid KV Cache Compression for Efficient Multimodal Large Language Model Inference

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

HybridKV reduces KV cache memory by up to 7.9x and speeds decoding by 1.52x in MLLMs with almost no performance loss by classifying heads into static and dynamic types and compressing them differently.

citing papers explorer

Showing 3 of 3 citing papers.

AIA: Rethinking Architecture Decoupling Strategy In Unified Multimodal Model

cs.CV · 2025-11-27 · unverdicted · none · ref 37

AIA loss teaches unified multimodal models task-specific cross-modal attention patterns to reduce conflicts between image understanding and generation without architecture decoupling.

Vision-Core Guided Contrastive Learning for Balanced Multi-modal Prognosis Prediction of Stroke

cs.CV · 2026-05-14 · unverdicted · none · ref 30

A tri-modal model with LLM-generated text from MRIs and a vision-guided dual alignment fusion module achieves state-of-the-art performance on real-world ischemic stroke prognosis prediction.

HybridKV: Hybrid KV Cache Compression for Efficient Multimodal Large Language Model Inference

cs.AI · 2026-04-07 · unverdicted · none · ref 4

HybridKV reduces KV cache memory by up to 7.9x and speeds decoding by 1.52x in MLLMs with almost no performance loss by classifying heads into static and dynamic types and compressing them differently.
