
APO: Enhancing reasoning ability of MLLMs via asymmetric policy optimization. arXiv preprint arXiv:2506.21655

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.LG 2 cs.CV 1

years

2026 2 2025 1

verdicts

UNVERDICTED 3


representative citing papers

Latent Visual Reasoning

cs.CV · 2025-09-29 · unverdicted · novelty 7.0

Latent Visual Reasoning enables autoregressive generation of latent visual states that reconstruct critical image tokens, yielding gains on perception-heavy VQA benchmarks such as 71.67% on MMVP.

PS-PPO: Prefix-Sampling PPO for Critic-Free RLHF

cs.LG · 2026-06-29 · unverdicted · novelty 6.0

PS-PPO samples prefixes of trajectories in critic-free RLHF and uses importance-weighted updates to reduce compute and memory while claiming to preserve the full-trajectory objective.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Latent Visual Reasoning cs.CV · 2025-09-29 · unverdicted · none · ref 10