
arxiv: 2503.09399 · v4 · pith:XWP7K4QOnew · submitted 2025-03-12 · 💻 cs.CV · cs.AI · cs.LG

ForAug: Mitigating Biases in Image Classification via Controlled Image Compositions

classification 💻 cs.CV cs.AI cs.LG
keywords: foraug, background, foreground, image, accuracy, biases, controlled, object

Large-scale image classification datasets exhibit strong compositional biases: objects tend to be centered, appear at characteristic scales, and co-occur with class-specific context. By exploiting such biases, models attain high in-distribution accuracy but remain fragile under distribution shifts. To address this issue, we introduce ForAug, a controlled composition augmentation scheme that factorizes each training image into a foreground object and a background and recombines them to explicitly manipulate object position, object scale, and background identity. ForAug uses off-the-shelf segmentation and inpainting models to (i) extract the foreground and synthesize a neutral background, and (ii) paste the foreground onto diverse neutral backgrounds before applying standard strong augmentation policies. Compared to conventional augmentations and content-mixing methods, our factorization provides direct control knobs that break foreground-background correlations. Across 10 architectures, ForAug improves ImageNet top-1 accuracy by up to 6 percentage points (p.p.) and yields gains of up to 7.3 p.p. on fine-grained downstream datasets. Moreover, the same control knobs enable targeted diagnostic tests: we quantify background reliance, foreground focus, center bias, and size bias via controlled background swaps and position/scale sweeps, and show that training with ForAug substantially reduces these shortcut behaviors and significantly increases accuracy on standard distribution-shift benchmarks by up to 19 p.p. Our code and dataset are publicly available at https://github.com/tobna/ForAug.
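
The recombination step described in the abstract (pasting an extracted foreground onto a background at a chosen position and scale) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `resize_nearest` and `compose`, the nearest-neighbour resizing, the relative `center` convention, and the clamping that keeps the object inside the frame are all assumptions made for illustration. Segmentation and inpainting are assumed to have been done already, so the inputs are a foreground crop, its binary mask, and a background image.

```python
import numpy as np


def resize_nearest(arr, new_h, new_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    h, w = arr.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return arr[rows[:, None], cols[None, :]]


def compose(fg, mask, bg, scale, center):
    """Paste a masked foreground onto a background (hypothetical helper).

    fg:     (h, w, 3) uint8 foreground crop
    mask:   (h, w) bool mask, True where the object is
    bg:     (H, W, 3) uint8 background image
    scale:  resize factor applied to the foreground
    center: (row, col) object centre in relative [0, 1] coordinates
    """
    H, W = bg.shape[:2]
    # Scaled size, limited to the background size.
    new_h = max(1, min(H, round(fg.shape[0] * scale)))
    new_w = max(1, min(W, round(fg.shape[1] * scale)))
    fg_r = resize_nearest(fg, new_h, new_w)
    mask_r = resize_nearest(mask, new_h, new_w)

    # Top-left corner, clamped so the object stays fully inside the frame.
    top = int(round(center[0] * H - new_h / 2))
    left = int(round(center[1] * W - new_w / 2))
    top = min(max(top, 0), H - new_h)
    left = min(max(left, 0), W - new_w)

    out = bg.copy()
    region = out[top:top + new_h, left:left + new_w]  # view into `out`
    region[mask_r] = fg_r[mask_r]  # overwrite only the object pixels
    return out


# Toy example: a white 4x4 object on a black 8x8 background.
bg = np.zeros((8, 8, 3), dtype=np.uint8)
fg = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.ones((4, 4), dtype=bool)
centered = compose(fg, mask, bg, scale=0.5, center=(0.5, 0.5))
corner = compose(fg, mask, bg, scale=1.0, center=(0.0, 0.0))
```

Under these assumptions, varying `center` or `scale` over a grid while holding the foreground fixed gives the kind of position and scale sweep the abstract mentions, and swapping `bg` while holding the foreground fixed gives a background swap.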



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Seeing Through Circuits: Faithful Mechanistic Interpretability for Vision Transformers

    cs.AI 2026-04 unverdicted novelty 6.0

    Edge-based circuits in vision transformers can be automatically recovered to explain and steer model computations for classification and adversarial behaviors.