MOPO: Model-Based Offline Policy Optimization

2020 · arXiv 2005.13239

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Mastering Atari with Discrete World Models

cs.LG · 2020-10-05 · accept · novelty 7.0

DreamerV2 reaches human-level performance on 55 Atari games by learning behaviors inside a separately trained discrete-latent world model.

SC3-Eval: Evaluating Robot Foundation Models via Self-Consistent Video Generation

cs.RO · 2026-06-17 · unverdicted · novelty 6.0

SC3-Eval enforces three consistencies on a video model to produce policy rollouts that correlate 0.929 with real-world performance across seven vision-language-action policies and reproduce observed failure modes.

Neuro-Inspired Inverse Learning for Planning and Control

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

The Inverter framework formalizes inverse learning to generate coherent multi-step trajectories, outperforming offline RL and diffusion baselines on D4RL maze tasks by 24% on average with 10-100x less inference time while also matching GRAPE fidelity on single-qubit gates at >1000x speed.

JD-BP: A Joint-Decision Generative Framework for Auto-Bidding and Pricing

cs.GT · 2026-04-07 · unverdicted · novelty 6.0

JD-BP jointly generates bids and pricing corrections via generative models, memory-less return-to-go, trajectory augmentation, and energy-based DPO to improve auto-bidding performance despite prediction errors and latency.

Safety, Security, and Cognitive Risks in World Models

cs.CR · 2026-04-01 · unverdicted · novelty 6.0

World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and DreamerV3.

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation

cs.RO · 2021-08-06 · accept · novelty 6.0

A comprehensive benchmark study of offline imitation learning methods on multi-stage robot manipulation tasks identifies key sensitivities to algorithm design, data quality, and stopping criteria while releasing all datasets and code.

Explainable Wastewater Digital Twins: Adaptive Context-Conditioned Structured Simulators with Self-Falsifying Decision Support

cs.AI · 2026-05-19 · unverdicted · novelty 5.0

CCSS-IX is a context-conditioned structured simulator for wastewater digital twins that uses adaptive expert mixing and self-falsifying conformal decision rules to reduce unsafe actions while maintaining low prediction error on real plant and benchmark data.
