Identical values are merged

Hyperparameter SimbaV1, SimbaV1+MINTO DMC-Hard HumanoidBench MuJoCo Discount Factor (γ) 0 · 2000

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning

cs.LG · 2025-10-02 · unverdicted · novelty 6.0

MINTO sets bootstrapped targets to the minimum of online and target network estimates, yielding faster stable value learning across online/offline RL and discrete/continuous actions.
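The summary above can be illustrated with a minimal sketch of a one-step bootstrapped target that takes the element-wise minimum of the online and target network estimates. The function name `minto_target`, its parameters, and the assumption that each network has already been reduced to a scalar next-state value per transition are illustrative choices made here, not details taken from the paper.

```python
import numpy as np

def minto_target(reward, done, next_value_online, next_value_target, gamma=0.99):
    """One-step TD target that bootstraps from min(online, target).

    All names here are hypothetical. The inputs are per-transition arrays;
    next_value_online / next_value_target are assumed to be next-state value
    estimates already computed by the online and target networks.
    """
    reward = np.asarray(reward, dtype=np.float64)
    done = np.asarray(done, dtype=np.float64)
    # Element-wise minimum of the two estimates is used as the bootstrap value.
    bootstrap = np.minimum(next_value_online, next_value_target)
    # Terminal transitions (done == 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * bootstrap

targets = minto_target(
    reward=[1.0, 0.0],
    done=[0.0, 1.0],
    next_value_online=[2.0, 5.0],
    next_value_target=[3.0, 4.0],
    gamma=0.5,
)
print(targets)
```

In the first transition the online estimate (2.0) is the smaller one and is used for bootstrapping; the second transition is terminal, so its target reduces to the reward alone.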

citing papers explorer

Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning cs.LG · 2025-10-02 · unverdicted · none · ref 16