arXiv preprint arXiv:2601.06965 , year=

Unified Personalized Understanding, Generating, Editing , author= · 2026 · arXiv 2601.06965

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

EgoCoT-Bench: Benchmarking Grounded and Verifiable Operation-Centric Chain of Thought Reasoning for MLLMs

cs.CV · 2026-05-19 · unverdicted · novelty 5.0

EgoCoT-Bench provides 3,172 verifiable QA pairs across perception, anticipation, and reasoning tasks on egocentric videos, revealing that many MLLMs give answer-correct but evidence-inconsistent explanations.

Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis

cs.CV · 2026-05-19 · unverdicted · novelty 5.0

TIF-GRPO uses integral feedback on pseudo-temporal trajectories to regulate anatomy-aware rewards in RL for clinical faithfulness in volumetric CT analysis.

CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

CrossView Suite supplies a 1.6M-sample dataset, scene-disjoint benchmark, and explicit-alignment framework to advance MLLMs from single-view perception to cross-view spatial intelligence.

citing papers explorer

Showing 3 of 3 citing papers.

EgoCoT-Bench: Benchmarking Grounded and Verifiable Operation-Centric Chain of Thought Reasoning for MLLMs cs.CV · 2026-05-19 · unverdicted · none · ref 57
EgoCoT-Bench provides 3,172 verifiable QA pairs across perception, anticipation, and reasoning tasks on egocentric videos, revealing that many MLLMs give answer-correct but evidence-inconsistent explanations.
Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis cs.CV · 2026-05-19 · unverdicted · none · ref 72
TIF-GRPO uses integral feedback on pseudo-temporal trajectories to regulate anatomy-aware rewards in RL for clinical faithfulness in volumetric CT analysis.
CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark cs.CV · 2026-05-18 · unverdicted · none · ref 51
CrossView Suite supplies a 1.6M-sample dataset, scene-disjoint benchmark, and explicit-alignment framework to advance MLLMs from single-view perception to cross-view spatial intelligence.

arXiv preprint arXiv:2601.06965 , year=

fields

years

verdicts

representative citing papers

citing papers explorer