pith. sign in

hub

Advances in neural information processing systems , volume=

54 Pith papers cite this work. Polarity classification is still indexing.

54 Pith papers citing it

hub tools

citation-role summary

background 2 method 2

citation-polarity summary

years

2026 54

clear filters

representative citing papers

DISA: Offline Importance Sampling for Distribution-Matching LLM-RL

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

DISA decouples partition function estimation using offline importance sampling for distribution-matching LLM-RL, matching or exceeding online baselines like FlowRL on math and code benchmarks while retaining more strategy diversity.

Relative Score Policy Optimization for Diffusion Language Models

cs.CL · 2026-05-11 · unverdicted · novelty 7.0

RSPO interprets reward advantages as targets for relative log-ratios in dLLMs, calibrating noisy estimates to stabilize RLVR training and achieve strong gains on planning tasks with competitive math reasoning performance.

Pessimism-Free Offline Learning in General-Sum Games via KL Regularization

cs.LG · 2026-04-30 · unverdicted · novelty 7.0 · 2 refs

KL regularization enables pessimism-free offline learning in general-sum games, recovering regularized Nash equilibria at accelerated rate O(1/n) via GANE and converging to coarse correlated equilibria at standard rate O(1/sqrt(n)+1/T) via GAMD.

Validity-Calibrated Reasoning Distillation

cs.LG · 2026-04-14 · unverdicted · novelty 7.0

Validity-calibrated reasoning distillation improves transfer of reasoning skills by modulating updates based on relative local validity of next steps instead of enforcing full trajectory imitation.

citing papers explorer

Showing 6 of 6 citing papers after filters.