Large vision-language model alignment and misalignment: A survey through the lens of explainability

Dong Shu, Haiyan Zhao, Jingyu Hu, Weiru Liu, Ali Payani, Lu Cheng, Mengnan Du · 2025 · arXiv 2501.01346

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding

cs.LG · 2026-04-22 · unverdicted · novelty 7.0

PND reduces object hallucination in VLMs via a dual-path contrast during decoding that amplifies visual features and penalizes linguistic priors, achieving reported SOTA results on POPE, MME, and CHAIR without retraining.

Clearer Sight, Fewer Lies: Oriented Pickup Preference Optimization for Multimodal Hallucination Mitigation

cs.CV · 2026-06-29 · unverdicted · novelty 6.0 · 2 refs

OPPO is an evidence-aware preference optimization objective that contrasts faithful responses under varying visual evidence strengths to reduce hallucinations in MLLMs.

Towards Long-horizon Agentic Multimodal Search

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

LMM-Searcher uses file-based visual UIDs and a fetch tool plus 12K synthesized trajectories to fine-tune a multimodal agent that scales to 100-turn horizons and reaches SOTA among open-source models on MM-BrowseComp and MMSearch-Plus.

Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs

cs.AI · 2025-12-09 · unverdicted · novelty 6.0

State-of-the-art MLLMs show substantial inconsistency when reasoning over the same information presented in image, text, or mixed modalities, even after accounting for OCR errors, with inconsistency linked to visual factors and modality gap.

Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey

cs.RO · 2025-08-18 · unverdicted · novelty 5.0

This survey organizes large VLM-based VLA models for robotic manipulation into monolithic and hierarchical paradigms, reviews their integrations and datasets, and outlines future directions.

Generative Models and Connected and Automated Vehicles: A Survey in Exploring the Intersection of Transportation and AI

cs.LG · 2024-03-14 · unverdicted · novelty 2.0

A survey reviewing the integration of generative models with connected and automated vehicles to enhance predictive modeling, simulation accuracy, and decision-making.

citing papers explorer

Generative Models and Connected and Automated Vehicles: A Survey in Exploring the Intersection of Transportation and AI

cs.LG · 2024-03-14 · unverdicted · none · ref 100

A survey reviewing the integration of generative models with connected and automated vehicles to enhance predictive modeling, simulation accuracy, and decision-making.
