LoRA: Low-Rank Adaptation of Large Language Models
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers explorer
- Weather-Robust Scene Semantics with Vision-Aligned 4D Radar
Radar encoders aligned to frozen SigLIP embeddings enable weather-robust scene captioning via a frozen VLM with 7M trainable parameters, outperforming cameras on held-out adverse-weather sequences in K-RADAR.
- Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models
The paper compiles hardware-software co-design techniques including mixed-precision quantization, structural pruning, speculative decoding, and transformer accelerators to speed up multimodal foundation models, with examples in medical and code tasks.