LLaVA-OneVision: Easy visual task transfer. Transactions on Machine Learning Research, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
Years: 2026 (2)
Representative citing papers:
- Forest Before Trees: Latent Superposition for Efficient Visual Reasoning
Laser reformulates visual reasoning via Dynamic Windowed Alignment Learning to maintain latent superposition of global features, delivering 5.03% average gains over Monet and over 97% fewer inference tokens on six benchmarks.
- ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion