Advances in Neural Information Processing Systems

Reinforcement learning algorithm for partially observable Markov decision problems

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Structural Equivalence and Learning Dynamics in Delayed MARL

cs.LG · 2026-05-05 · accept · novelty 8.0

Observation and action delays are formally equivalent in cooperative Dec-POMDPs, yielding identical optimal solutions and enabling zero-shot transfer, though learning dynamics differ due to credit assignment and operational constraints.

Learning Interactive Real-World Simulators

cs.AI · 2023-10-09 · conditional · novelty 7.0

UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.

citing papers explorer

Showing 2 of 2 citing papers.

Structural Equivalence and Learning Dynamics in Delayed MARL cs.LG · 2026-05-05 · accept · none · ref 13
Learning Interactive Real-World Simulators cs.AI · 2023-10-09 · conditional · none · ref 125
