Title resolution pending

1992

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

An Encoded Corrective Double Deep Q-Networks for Multi-Agent Control Systems

eess.SP · 2026-05-13 · unverdicted · novelty 6.0

A corrective double deep Q-network framework uses encoded message-passing to refine delayed and noisy global states for improved multi-agent control policies.

Self-Predictive Representation for Autonomous UAV Object-Goal Navigation

cs.RO · 2026-04-22 · unverdicted · novelty 6.0

AmelPredSto, a stochastic self-predictive representation model, outperforms other state representation learning approaches when combined with actor-critic RL for object-goal navigation in UAVs.

Synthesis and Deployment of Maximal Robust Control Barrier Functions through Adversarial Reinforcement Learning

eess.SY · 2026-04-14 · unverdicted · novelty 6.0

A new robust Q-CBF framework synthesized via adversarial RL enables safety enforcement on the maximal robust safe set for black-box nonlinear systems.

Reinforcement Learning from Human Feedback

cs.LG · 2025-04-16 · unverdicted · novelty 2.0

The book introduces the origins, mathematical setup, and optimization stages of RLHF including reward modeling, reinforcement learning, rejection sampling, and direct alignment algorithms.

citing papers explorer

Showing 4 of 4 citing papers.

An Encoded Corrective Double Deep Q-Networks for Multi-Agent Control Systems
eess.SP · 2026-05-13 · unverdicted · none · ref 15
A corrective double deep Q-network framework uses encoded message-passing to refine delayed and noisy global states for improved multi-agent control policies.

Self-Predictive Representation for Autonomous UAV Object-Goal Navigation
cs.RO · 2026-04-22 · unverdicted · none · ref 35
AmelPredSto, a stochastic self-predictive representation model, outperforms other state representation learning approaches when combined with actor-critic RL for object-goal navigation in UAVs.

Synthesis and Deployment of Maximal Robust Control Barrier Functions through Adversarial Reinforcement Learning
eess.SY · 2026-04-14 · unverdicted · none · ref 17
A new robust Q-CBF framework synthesized via adversarial RL enables safety enforcement on the maximal robust safe set for black-box nonlinear systems.

Reinforcement Learning from Human Feedback
cs.LG · 2025-04-16 · unverdicted · none · ref 240
The book introduces the origins, mathematical setup, and optimization stages of RLHF including reward modeling, reinforcement learning, rejection sampling, and direct alignment algorithms.
