Vision language models in autonomous driving: A survey and outlook
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Representative citing papers
- Rethinking Foundation Model Collaboration: Enhancing Specialized Models through Proxy Task Reasoning
FAT decomposes structured prediction into specialist hypothesis generation and foundation-model proxy reasoning, yielding consistent gains over baselines on detection, trajectory, and segmentation tasks.
- Spatial Reasoning via Modality Switching Between Language and Symbolic Representation