Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis
  AICA-Bench evaluates 23 VLMs on affective image analysis, identifies weak intensity calibration and shallow descriptions as limitations, and proposes training-free Grounded Affective Tree Prompting to improve performance.
- ReGATE: Learning Faster and Better with Fewer Tokens in MLLMs
  ReGATE introduces a teacher-student adaptive token elision method that reduces training tokens to 38% while matching or exceeding baseline accuracy on multimodal benchmarks.
- When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models
  The work identifies a small set of attention heads in VLMs that mediate conflicts between parametric knowledge and visual input, and shows that intervening on them steers model behavior while attention patterns provide precise image-region attribution.