pith. sign in

hub

Advances in neural information processing systems , volume=

54 Pith papers cite this work. Polarity classification is still indexing.

54 Pith papers citing it

hub tools

citation-role summary

background 2 method 2

citation-polarity summary

years

2026 54

representative citing papers

DISA: Offline Importance Sampling for Distribution-Matching LLM-RL

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

DISA decouples partition function estimation using offline importance sampling for distribution-matching LLM-RL, matching or exceeding online baselines like FlowRL on math and code benchmarks while retaining more strategy diversity.

Relative Score Policy Optimization for Diffusion Language Models

cs.CL · 2026-05-11 · unverdicted · novelty 7.0

RSPO interprets reward advantages as targets for relative log-ratios in dLLMs, calibrating noisy estimates to stabilize RLVR training and achieve strong gains on planning tasks with competitive math reasoning performance.

Pessimism-Free Offline Learning in General-Sum Games via KL Regularization

cs.LG · 2026-04-30 · unverdicted · novelty 7.0 · 2 refs

KL regularization enables pessimism-free offline learning in general-sum games, recovering regularized Nash equilibria at accelerated rate O(1/n) via GANE and converging to coarse correlated equilibria at standard rate O(1/sqrt(n)+1/T) via GAMD.

Validity-Calibrated Reasoning Distillation

cs.LG · 2026-04-14 · unverdicted · novelty 7.0

Validity-calibrated reasoning distillation improves transfer of reasoning skills by modulating updates based on relative local validity of next steps instead of enforcing full trajectory imitation.

citing papers explorer

Showing 50 of 54 citing papers.