VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.
Ziegler and Ryan Lowe and Chelsea Voss and Alec Radford and Dario Amodei and Paul F
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
years
2026 3representative citing papers
MGDA-Decoupled applies geometry-based multi-objective optimization within the DPO framework to find shared descent directions that account for each objective's convergence dynamics, yielding higher win rates on UltraFeedback.