Prox- ythinker: Test-time guidance through small visual reasoners.arXiv preprint arXiv:2505.24872

Zilin Xiao, Jaywon Koo, Siru Ouyang, Jefferson Hernandez, Yu Meng, Vicente Ordonez · 2025 · arXiv 2505.24872

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration

cs.LG · 2026-04-13 · unverdicted · novelty 7.0

NExt accelerates RLVR training for LLMs by nonlinearly extrapolating low-rank parameter trajectories extracted from LoRA runs.

GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

cs.AI · 2026-01-08 · unverdicted · novelty 6.0

GlimpRouter uses the entropy of the first token in each reasoning step to decide whether to invoke a large model, yielding 10.7% higher accuracy and 25.9% lower latency than a standalone large model on AIME25.

OneThinker: All-in-one Reasoning Model for Image and Video

cs.CV · 2025-12-02 · unverdicted · novelty 5.0

OneThinker unifies image and video reasoning in one model across 10 tasks via a 600k corpus, CoT-annotated SFT, and EMA-GRPO reinforcement learning, reporting strong results on 31 benchmarks plus some cross-task transfer.

citing papers explorer

Showing 3 of 3 citing papers.

Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration cs.LG · 2026-04-13 · unverdicted · none · ref 35
NExt accelerates RLVR training for LLMs by nonlinearly extrapolating low-rank parameter trajectories extracted from LoRA runs.
GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts cs.AI · 2026-01-08 · unverdicted · none · ref 20
GlimpRouter uses the entropy of the first token in each reasoning step to decide whether to invoke a large model, yielding 10.7% higher accuracy and 25.9% lower latency than a standalone large model on AIME25.
OneThinker: All-in-one Reasoning Model for Image and Video cs.CV · 2025-12-02 · unverdicted · none · ref 45
OneThinker unifies image and video reasoning in one model across 10 tasks via a 600k corpus, CoT-annotated SFT, and EMA-GRPO reinforcement learning, reporting strong results on 31 benchmarks plus some cross-task transfer.

Prox- ythinker: Test-time guidance through small visual reasoners.arXiv preprint arXiv:2505.24872

fields

years

verdicts

representative citing papers

citing papers explorer