Title resolution pending

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, Test-Time Scaling · 2025

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

OProver: A Unified Framework for Agentic Formal Theorem Proving

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.

VLM-AR3L: Vision-Language Models for Absolute and Relative Rewards in Reinforcement Learning

cs.RO · 2026-07-01 · unverdicted · novelty 5.0 · 2 refs

VLM-AR3L learns absolute and relative reward models from VLM preference labels to improve RL on control, manipulation, and Minecraft tasks.

COPRA: Conditional Parameter Adaptation with Reinforcement Learning for Video Anomaly Detection

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

COPRA introduces conditional parameter adaptation via RL to dynamically tune frozen VLMs for video anomaly detection, outperforming static methods in in-domain and cross-domain settings while generalizing to other video tasks.

citing papers explorer

Showing 3 of 3 citing papers.

OProver: A Unified Framework for Agentic Formal Theorem Proving
cs.CL · 2026-05-17 · unverdicted · none · ref 47

VLM-AR3L: Vision-Language Models for Absolute and Relative Rewards in Reinforcement Learning
cs.RO · 2026-07-01 · unverdicted · none · ref 50 · 2 links

COPRA: Conditional Parameter Adaptation with Reinforcement Learning for Video Anomaly Detection
cs.CV · 2026-05-14 · unverdicted · none · ref 29
