Layer-wise Laplacian energy of visual attention reveals hallucination emergence in MLLMs and enables LaSCD, a closed-form logit remapping strategy that mitigates hallucinations while preserving general performance.
RealRAG: Retrieval-augmented realistic image generation via self-reflective contrastive learning
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 2
citation-polarity summary
fields: cs.CV 3
years: 2026 3
roles: background 2
polarities: background 2
representative citing papers
RAVA retrieves view-consistent target-subject images via a learned cross-instance embedding and LogDet subset selection, then uses them in a multi-reference generator to improve cross-subject viewpoint alignment.