NuScenes-QA: A multi-modal visual question answering benchmark for autonomous driving scenario
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
citing papers explorer
- SceneMiner: Identity-Preserving Multi-Task Fine-Tuning for Unified BEV Scene Mining
SceneMiner shows that identity-preserving multi-task fine-tuning removes cross-task interference by zero-initializing new heads and freezing shared-stream parameters, enabling unified BEV scene mining with preserved original heads.
- VeriDrive: Verifiable Counterfactual Supervision for Cost-Efficient Vision-Language Planning
VeriDrive introduces a verifiable counterfactual supervision framework using a Perception-Evaluation-Revision chain and validator-guided correction to generate cost-efficient structured data for vision-language driving models, showing metric gains on nuScenes.