Title resolution pending

Introduction to algorithms · 2022

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

A sharp interaction-degree threshold for simulating QAOA

quant-ph · 2026-05-21 · unverdicted · novelty 6.0

QAOA sampling hardness has a sharp threshold at interaction degree 3, where depth-1 approximate sampling implies PH collapse to the third level, but degree-2 instances remain efficiently simulable at logarithmic depth.

Privacy Preserving Reinforcement Learning with One-Sided Feedback

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

POOL is a new RL algorithm that adds privacy protection in continuous spaces with one-sided feedback and achieves sample complexity matching known non-private lower bounds.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

citing papers explorer

A sharp interaction-degree threshold for simulating QAOA

quant-ph · 2026-05-21 · unverdicted · none · ref 7

Privacy Preserving Reinforcement Learning with One-Sided Feedback

cs.LG · 2026-05-18 · unverdicted · none · ref 61

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · none · ref 211

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · none · ref 226
