Learning transferable visual models from natural language supervision
2 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
citing papers explorer
- Urban Risk-Aware Navigation via VQA-Based Event Maps for People with Low Vision
A hierarchical VQA system aggregates model answers into weighted risk scores that produce four-category safety event maps for urban navigation, backed by a new 20-city dataset where generative MLLMs like Qwen-VL outperform classification models.
- GAN-based Domain Adaptation for Image-aware Layout Generation in Advertising Poster Design
PDA-GAN with pixel discriminator bridges domain gap from inpainted posters to generate SOTA image-aware layouts on a new 60k-pair CGL-Dataset.