Unleashing Vision Transformer Potential In Image Quality Assessment via Global-Local Adaptive Interaction

2026 · cs.CV · arXiv 2605.17748

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

In the field of Blind Image Quality Assessment (BIQA), accurately predicting the perceptual quality of authentically distorted images remains highly challenging due to the diverse and complex distortions present in natural environments. Although existing methods have achieved notable accuracy, their scalability is often constrained by the high cost of subjective annotation and the limited size of available datasets. Recent advances in large-scale pre-trained vision models have introduced powerful semantic and representational capabilities, yet their application to IQA tasks is hindered by substantial computational demands and suboptimal fine-tuning efficiency. To overcome these limitations, we introduce the Global-Local Interaction Adapter (GLIA), a novel framework that effectively harnesses pre-trained Vision Transformers through a dual-stream feature extraction mechanism coupled with interactive global-local fusion. By jointly retaining global semantic information and fine-grained local details, our approach delivers superior prediction accuracy and robustness while requiring significantly fewer trainable parameters. Extensive experiments on multiple benchmarks validate the effectiveness and superiority of our approach.
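The abstract does not specify how the global and local streams interact, so the following is only a minimal sketch of one plausible reading: a single global feature vector attends over local patch vectors, and the attended summary is blended back into the global vector. The function `global_local_fuse`, the `gate` parameter, and the use of scaled dot-product attention are all assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def global_local_fuse(global_feat, local_feats, gate=0.5):
    """Hypothetical fusion step (an assumption, not the paper's GLIA module).

    global_feat: one vector standing in for global semantic information.
    local_feats: list of vectors standing in for fine-grained patch features.
    gate: blend weight; 0 keeps only the global vector, 1 keeps only the
          attention-weighted local summary.
    """
    dim = len(global_feat)
    # Scaled dot-product scores between the global vector and each patch.
    scores = [dot(global_feat, p) / math.sqrt(dim) for p in local_feats]
    weights = softmax(scores)
    # Attention-weighted average of the local patch vectors.
    local_summary = [
        sum(w * p[i] for w, p in zip(weights, local_feats)) for i in range(dim)
    ]
    # Blend the global vector with the local summary.
    fused = [(1 - gate) * g + gate * l for g, l in zip(global_feat, local_summary)]
    return fused, weights
```

Calling `global_local_fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])` returns a fused vector together with attention weights that sum to 1 and favor the patch most aligned with the global vector. In a parameter-efficient setting, a module of this kind would typically sit beside a frozen backbone, which is one way the "fewer trainable parameters" claim could be realized.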

representative citing papers

Unleashing Vision Transformer Potential In Image Quality Assessment via Global-Local Adaptive Interaction

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Proposes the GLIA framework, which adapts Vision Transformers to blind image quality assessment through dual-stream global-local interaction, and claims higher accuracy and robustness with fewer trainable parameters.

citing papers explorer

Showing 1 of 1 citing paper.

Unleashing Vision Transformer Potential In Image Quality Assessment via Global-Local Adaptive Interaction

cs.CV · 2026-05-18 · unverdicted · none · ref 5 · internal anchor

Proposes the GLIA framework, which adapts Vision Transformers to blind image quality assessment through dual-stream global-local interaction, and claims higher accuracy and robustness with fewer trainable parameters.
