ForAug: Mitigating Biases in Image Classification via Controlled Image Compositions
Large-scale image classification datasets exhibit strong compositional biases: objects tend to be centered, appear at characteristic scales, and co-occur with class-specific context. By exploiting such biases, models attain high in-distribution accuracy but remain fragile under distribution shifts. To address this issue, we introduce ForAug, a controlled composition augmentation scheme that factorizes each training image into a foreground object and a background and recombines them to explicitly manipulate object position, object scale, and background identity. ForAug uses off-the-shelf segmentation and inpainting models to (i) extract the foreground and synthesize a neutral background, and (ii) paste the foreground onto diverse neutral backgrounds before applying standard strong augmentation policies. Compared to conventional augmentations and content-mixing methods, our factorization provides direct control knobs that break foreground-background correlations. Across 10 architectures, ForAug improves ImageNet top-1 accuracy by up to 6 percentage points (p.p.) and yields gains of up to 7.3 p.p. on fine-grained downstream datasets. Moreover, the same control knobs enable targeted diagnostic tests: we quantify background reliance, foreground focus, center bias, and size bias via controlled background swaps and position/scale sweeps, and show that training with ForAug substantially reduces these shortcut behaviors and significantly increases accuracy on standard distribution-shift benchmarks by up to 19 p.p. Our code and dataset are publicly available at https://github.com/tobna/ForAug.
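The recombination step described in the abstract (pasting an extracted foreground onto a neutral background at a chosen position and scale) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `resize_nearest` and `compose`, the parameter conventions (relative `scale` and `center`), and the nearest-neighbour resizing are all assumptions made for illustration, and the segmentation and inpainting steps are assumed to have already produced the foreground crop, its mask, and the background.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of an (H, W, ...) array (illustrative helper)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def compose(foreground, mask, background, scale, center):
    """Paste a masked foreground onto a background (hypothetical sketch).

    foreground: (h, w, 3) float array, the object crop
    mask:       (h, w) float array in [0, 1], 1 where the object is
    background: (H, W, 3) float array
    scale:      object height as a fraction of the background height
    center:     (y, x) object center in relative [0, 1] coordinates
    """
    H, W = background.shape[:2]
    h, w = mask.shape
    # Target object size, preserving the crop's aspect ratio.
    new_h = max(1, int(round(scale * H)))
    new_w = max(1, int(round(new_h * w / h)))
    fg = resize_nearest(foreground, new_h, new_w)
    alpha = resize_nearest(mask, new_h, new_w)[..., None]

    # Top-left corner of the paste window in background coordinates.
    top = int(round(center[0] * H - new_h / 2))
    left = int(round(center[1] * W - new_w / 2))

    # Clip the paste window to the canvas so off-center objects are cropped.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + new_h, H), min(left + new_w, W)
    out = background.copy()
    if y0 >= y1 or x0 >= x1:
        return out  # object lies entirely outside the canvas

    fy0, fx0 = y0 - top, x0 - left
    fy1, fx1 = fy0 + (y1 - y0), fx0 + (x1 - x0)
    a = alpha[fy0:fy1, fx0:fx1]
    # Alpha-blend the foreground over the background inside the window.
    out[y0:y1, x0:x1] = a * fg[fy0:fy1, fx0:fx1] + (1 - a) * out[y0:y1, x0:x1]
    return out

# Usage: sample the position and scale "knobs" at random for one training image.
rng = np.random.default_rng(0)
background = np.zeros((8, 8, 3))
foreground = np.ones((4, 4, 3))
mask = np.ones((4, 4))
augmented = compose(foreground, mask, background,
                    scale=rng.uniform(0.3, 0.9),
                    center=rng.uniform(0.2, 0.8, size=2))
```

Holding the foreground fixed while sweeping `center` or `scale`, or swapping `background`, is also one plausible way to set up the kind of position, size, and background diagnostics the abstract mentions.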
Forward citations
Cited by 1 Pith paper
- Seeing Through Circuits: Faithful Mechanistic Interpretability for Vision Transformers
Edge-based circuits in vision transformers can be automatically recovered to explain and steer model computations for classification and adversarial behaviors.