arxiv: 2507.19064 · v2 · pith:4CNNQRV4new · submitted 2025-07-25 · 💻 cs.CV

Negation-Aware Test-Time Adaptation for Vision-Language Models

classification 💻 cs.CV
keywords negation, NEAT, data, models, VLMs, adaptation, false, method

In this paper, we study a practical but underexplored problem in Vision-Language Models (VLMs): negation understanding. Many real-world applications require models to explicitly identify what is false or non-existent; for example, radiologists may search for images that exclude specific conditions. Despite the impressive transferability that VLMs gain from large-scale training, they suffer from a critical limitation: they fail to handle negation. Existing methods attribute this failure to the scarcity of negation training data and propose fine-tuning VLMs on massive data containing explicit negation. Such data-centric solutions demand substantial data and computational resources, which limits their sustainable, widespread adoption. To tackle negation in a low-carbon manner, we empirically observe that the key obstacle lies in the dual-concept shifts between the affirmation and negation distributions. Therefore, we propose a Negation-Aware Test-Time Adaptation (NEAT) method that efficiently adjusts distribution-related parameters during inference. In brief, NEAT reduces distribution shift in consistent semantics while eliminating false distributional consistency in unrelated semantics. Extensive experiments on various negation understanding tasks verify the effectiveness of the proposed method. Remarkably, with less than 0.01% of trainable parameters, NEAT achieves performance comparable or superior to state-of-the-art post-training approaches. Our code is available at https://github.com/hhc1997/NEAT.
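The abstract does not say which parameters NEAT adapts or what objective it uses, so the following is only a generic illustration of the underlying idea: reducing a shift between two feature distributions by fitting a handful of scale and shift parameters while everything else stays frozen. It is a toy moment-matching sketch, not the authors' method; the function names `fit_affine` and `apply_affine` and the two small embedding lists are hypothetical.

```python
from statistics import fmean, pstdev

def fit_affine(source, target, eps=1e-8):
    """Fit one (scale, shift) pair per dimension so that the mean and
    standard deviation of `source` match those of `target`.
    Hypothetical helper for illustration; not part of NEAT."""
    params = []
    for d in range(len(source[0])):
        s = [row[d] for row in source]
        t = [row[d] for row in target]
        scale = pstdev(t) / (pstdev(s) + eps)   # match the spread
        shift = fmean(t) - scale * fmean(s)     # match the centre
        params.append((scale, shift))
    return params

def apply_affine(rows, params):
    """Apply the per-dimension affine map to every embedding."""
    return [[scale * x + shift for x, (scale, shift) in zip(row, params)]
            for row in rows]

# Toy 2-D "embeddings" (made-up numbers, not real VLM features).
affirmation = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
negation = [[10.0, -1.0], [12.0, -3.0], [14.0, -5.0]]

# Only 2 * D numbers are fitted here; no other parameter changes.
params = fit_affine(negation, affirmation)
aligned = apply_affine(negation, params)
```

After the fit, the per-dimension mean and standard deviation of `aligned` agree with those of `affirmation`, which is the sense in which a small set of distribution-related parameters can absorb a shift. Any real test-time adaptation method, NEAT included, would need an objective that works without paired reference statistics.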

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. HANCLIP: A Family of Hyperbolic Angular Negation Vision Language Models

    cs.CV · 2026-06 · unverdicted · novelty 6.0

    HANCLIP restructures VLM embeddings with hyperbolic space and angular negation objectives to raise negation sensitivity on NegBench while keeping standard retrieval and classification performance.