QwQ-32B: Embracing the Power of Reinforcement Learning, March 2025

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

years

2026: 3 · 2025: 8

polarities

background 2

representative citing papers

GRIT: Teaching MLLMs to Think with Images

cs.CV · 2025-05-21 · unverdicted · novelty 7.0

GRIT introduces a grounded reasoning paradigm for MLLMs where reasoning chains interleave text and bounding boxes, trained via GRPO-GR reinforcement learning on as few as 20 examples without annotations.

Skywork Open Reasoner 1 Technical Report

cs.LG · 2025-05-28 · conditional · novelty 4.0

Skywork-OR1 uses RL on distilled CoT models to lift math and coding benchmark accuracy by 13-15 points while open-sourcing everything.
