Emerging Properties in Self-Supervised Vision Transformers
2 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: method (1)
Citation polarities: use method (1)
Representative citing papers:
- From Pixels to Prompts: Vision-Language Models
  An explanatory book that supplies a clear mental map and intuition for how Vision-Language Models combine vision and language capabilities.
- Seeing SDG 6 from space: local-scale monitoring of piped water and sewage system access across Africa using satellite imagery and self-supervised learning