Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research, 22(268):1–8
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
DRL with CBF safety filter enables guaranteed-safe spacecraft reorientation under pointing keep-out constraints via custom state representation and reward design, validated in Monte Carlo simulations.
ParallelCBF is a composable framework that unifies tensor-parallel UAV environments, hard-gate CBF safety filters, sharded BC-to-RL pipelines, and operational auditability as first-class APIs for safe reinforcement learning.
citing papers explorer
- QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning
  QAP-Router models qubit routing as dynamic QAP and applies RL with a solution-aware Transformer to cut CNOT counts by 12-30% versus industry compilers on real circuit benchmarks.
- stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation
  The paper presents stable-worldmodel (swm), a platform with high-performance data layer, modern world model baselines, planning solvers, and extended environments for reproducible research and generalization evaluation.
- Safe Deep Reinforcement Learning for Spacecraft Reorientation with Pointing Keep-Out Constraint
  DRL with CBF safety filter enables guaranteed-safe spacecraft reorientation under pointing keep-out constraints via custom state representation and reward design, validated in Monte Carlo simulations.
- parallelcbf: A composable safety-filter and auditability framework for tensor-parallel reinforcement learning
  ParallelCBF is a composable framework that unifies tensor-parallel UAV environments, hard-gate CBF safety filters, sharded BC-to-RL pipelines, and operational auditability as first-class APIs for safe reinforcement learning.