hub Mixed citations

MuJoCo: A physics engine for model-based control

Todorov, E · 2012 · arXiv 2012.638610

Mixed citation behavior. The most common role is background (67% of classified citations).

45 Pith papers citing it

citation-role summary

background 7 · dataset 2

citation-polarity summary

background 6 · use dataset 2 · unclear 1

representative citing papers

SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects

cs.AI · 2026-05-19 · unverdicted · novelty 7.0

SceneCode compiles natural language prompts into executable code programs that generate editable, articulated indoor scenes for physics simulation.

Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.

CelloCut: Constructive Watertight Remeshing via Tetrahedral Cell Cuts

cs.GR · 2026-05-18 · unverdicted · novelty 7.0

CelloCut formulates watertight remeshing as binary labeling on a Delaunay tetrahedral partition solved by graph-cut minimization with one-sided constraints to guarantee volumetrically consistent solids.

Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain

cs.NE · 2026-05-10 · unverdicted · novelty 7.0

An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory.

EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

EgoFun3D creates a new task, 271-video dataset, and pipeline using function templates to model interactive 3D objects from egocentric videos for simulation.

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

cs.RO · 2026-03-18 · unverdicted · novelty 7.0

HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.

Continuum Robot Localization using Distributed Time-of-Flight Sensors

cs.RO · 2026-02-06 · conditional · novelty 7.0

Distributed low-resolution time-of-flight sensors along a 53 cm continuum robot, fused with a shape prior, achieve 2.5 cm position and 7.2 degree orientation localization error in simulation and real experiments across multiple environments.

GLUE: Coordinating Pre-Trained Generative Models for System-Level Design

cs.CE · 2025-12-22 · conditional · novelty 7.0

GLUE orchestrates frozen pre-trained generative models into a system-level design generator that enforces feasibility, performance, and diversity, with data-driven and data-free variants benchmarked on UAV design.

Frictional Q-Learning

cs.LG · 2025-09-24 · unverdicted · novelty 7.0

Frictional Q-Learning encodes supported actions as tangent directions on an action manifold using a contrastive variational autoencoder to reduce extrapolation errors in off-policy reinforcement learning.

BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion

cs.RO · 2025-08-11 · conditional · novelty 7.0

BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.

LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines

cs.LG · 2024-11-12 · unverdicted · novelty 7.0

LLMPhy uses iterative LLM-generated programs executed in physics engines to solve continuous parameter estimation and discrete scene layout problems, outperforming prior black-box methods on three new zero-shot physical reasoning datasets.

Curriculum reinforcement learning with measurable task representation learning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

A VAE-based latent task representation enables automatic curriculum generation in CRL for non-Euclidean navigation tasks, outperforming interpolation and GAN-based methods in experiments.

ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders

cs.RO · 2026-05-19 · accept · novelty 6.0 · 2 refs

ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.

Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.

PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation

cs.CV · 2026-05-14 · conditional · novelty 6.0

PhyMotion scores generated human videos by grounding recovered 3D poses in a physics simulator across kinematic, contact, and dynamic axes, yielding stronger human correlation and larger RL post-training gains than prior 2D rewards.

R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

R2R2 introduces a non-centered regularization objective for SPL that addresses conflicts with spectral properties, leading to better performance on continuous control tasks at high UTD ratios.

Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

Stronger VLM agents use mirror reflections for self-identification in controlled 3D tests, while weaker ones inspect but fail to extract or correctly attribute self-relevant information.

Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation

cs.RO · 2026-04-30 · unverdicted · novelty 6.0

Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.

VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation

cs.RO · 2026-04-17 · unverdicted · novelty 6.0

VADF adds an Adaptive Loss Network for hard-negative training sampling and a Hierarchical Vision Task Segmenter for adaptive noise scheduling during inference to speed convergence and reduce timeouts in diffusion robotic policies.

Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

cs.LG · 2026-04-13 · unverdicted · novelty 6.0

Physics simulators generate synthetic QA data for RL training that improves LLM performance on IPhO problems by 5-10 percentage points.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

frax: Fast Robot Kinematics and Dynamics in JAX

cs.RO · 2026-04-05 · unverdicted · novelty 6.0

frax is a new open-source JAX library delivering low-microsecond CPU dynamics and over 100 million GPU evaluations per second for robot kinematics and dynamics with autodiff support.

HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control

cs.RO · 2026-02-03 · conditional · novelty 6.0

HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.

Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion

cs.RO · 2026-01-31 · unverdicted · novelty 6.0

MoE-based locomotion policy with RoboGauge metrics achieves reliable sim-to-real transfer, enabling robust quadrupedal walking on challenging unseen terrains up to 4 m/s.

citing papers explorer

Showing 45 of 45 citing papers.

SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects
cs.AI · 2026-05-19 · unverdicted · none · ref 35
SceneCode compiles natural language prompts into executable code programs that generate editable, articulated indoor scenes for physics simulation.

Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
cs.LG · 2026-05-18 · unverdicted · none · ref 91
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.

CelloCut: Constructive Watertight Remeshing via Tetrahedral Cell Cuts
cs.GR · 2026-05-18 · unverdicted · none · ref 45
CelloCut formulates watertight remeshing as binary labeling on a Delaunay tetrahedral partition solved by graph-cut minimization with one-sided constraints to guarantee volumetrically consistent solids.

Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
cs.NE · 2026-05-10 · unverdicted · none · ref 40
An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory.

EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates
cs.CV · 2026-04-13 · unverdicted · none · ref 52
EgoFun3D creates a new task, 271-video dataset, and pipeline using function templates to model interactive 3D objects from egocentric videos for simulation.

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness
cs.RO · 2026-03-18 · unverdicted · none · ref 30
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.

Continuum Robot Localization using Distributed Time-of-Flight Sensors
cs.RO · 2026-02-06 · conditional · none · ref 41
Distributed low-resolution time-of-flight sensors along a 53 cm continuum robot, fused with a shape prior, achieve 2.5 cm position and 7.2 degree orientation localization error in simulation and real experiments across multiple environments.

GLUE: Coordinating Pre-Trained Generative Models for System-Level Design
cs.CE · 2025-12-22 · conditional · none · ref 59
GLUE orchestrates frozen pre-trained generative models into a system-level design generator that enforces feasibility, performance, and diversity, with data-driven and data-free variants benchmarked on UAV design.

Frictional Q-Learning
cs.LG · 2025-09-24 · unverdicted · none · ref 25
Frictional Q-Learning encodes supported actions as tangent directions on an action manifold using a contrastive variational autoencoder to reduce extrapolation errors in off-policy reinforcement learning.

BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
cs.RO · 2025-08-11 · conditional · none · ref 77
BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.

LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines
cs.LG · 2024-11-12 · unverdicted · none · ref 1
LLMPhy uses iterative LLM-generated programs executed in physics engines to solve continuous parameter estimation and discrete scene layout problems, outperforming prior black-box methods on three new zero-shot physical reasoning datasets.

Curriculum reinforcement learning with measurable task representation learning
cs.LG · 2026-05-22 · unverdicted · none · ref 47
A VAE-based latent task representation enables automatic curriculum generation in CRL for non-Euclidean navigation tasks, outperforming interpolation and GAN-based methods in experiments.

ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
cs.RO · 2026-05-19 · accept · none · ref 33 · 2 links
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.

Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing
cs.LG · 2026-05-15 · unverdicted · none · ref 206
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.

PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
cs.CV · 2026-05-14 · conditional · none · ref 16
PhyMotion scores generated human videos by grounding recovered 3D poses in a physics simulator across kinematic, contact, and dynamic axes, yielding stronger human correlation and larger RL post-training gains than prior 2D rewards.

R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning
cs.LG · 2026-05-13 · unverdicted · none · ref 25
R2R2 introduces a non-centered regularization objective for SPL that addresses conflicts with spectral properties, leading to better performance on continuous control tasks at high UTD ratios.

Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?
cs.AI · 2026-05-09 · unverdicted · none · ref 32
Stronger VLM agents use mirror reflections for self-identification in controlled 3D tests, while weaker ones inspect but fail to extract or correctly attribute self-relevant information.

Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
cs.RO · 2026-04-30 · unverdicted · none · ref 6
Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.

VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation
cs.RO · 2026-04-17 · unverdicted · none · ref 27
VADF adds an Adaptive Loss Network for hard-negative training sampling and a Hierarchical Vision Task Segmenter for adaptive noise scheduling during inference to speed convergence and reduce timeouts in diffusion robotic policies.

Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
cs.LG · 2026-04-13 · unverdicted · none · ref 1
Physics simulators generate synthetic QA data for RL training that improves LLM performance on IPhO problems by 5-10 percentage points.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
cs.LG · 2026-04-06 · unverdicted · none · ref 84 · 2 links
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

frax: Fast Robot Kinematics and Dynamics in JAX
cs.RO · 2026-04-05 · unverdicted · none · ref 6
frax is a new open-source JAX library delivering low-microsecond CPU dynamics and over 100 million GPU evaluations per second for robot kinematics and dynamics with autodiff support.

HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control
cs.RO · 2026-02-03 · conditional · none · ref 34
HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.

Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion
cs.RO · 2026-01-31 · unverdicted · none · ref 57
MoE-based locomotion policy with RoboGauge metrics achieves reliable sim-to-real transfer, enabling robust quadrupedal walking on challenging unseen terrains up to 4 m/s.

Neural CDEs as Correctors for Learned Time Series Models
cs.LG · 2025-12-13 · unverdicted · none · ref 19
Neural CDEs serve as correctors that reduce error accumulation in multi-step forecasts from learned time-series models across synthetic, physics, and real-world data.

SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control
cs.RO · 2025-11-11 · unverdicted · none · ref 52
Scaling motion tracking models along size, data volume, and compute produces a foundation model for natural, robust humanoid whole-body control with downstream uses in kinematic planning and vision-language-action models.

Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
cs.RO · 2025-11-06 · unverdicted · none · ref 106
Isaac Lab is a unified GPU-native platform combining high-fidelity physics, photorealistic rendering, multi-frequency sensors, domain randomization, and learning pipelines for scalable multi-modal robot policy training.

GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
cs.RO · 2025-05-06 · unverdicted · none · ref 12
GraspVLA shows that pretraining a grasping model on a billion synthetic action frames enables zero-shot open-vocabulary performance and sim-to-real transfer.

AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
cs.LG · 2020-06-16 · unverdicted · none · ref 53
AWAC combines offline data with online RL via advantage-weighted actor-critic updates to enable faster acquisition of robotic skills such as dexterous manipulation.

Evolvability ES: Scalable and Direct Optimization of Evolvability
cs.NE · 2019-07-13 · unverdicted · none · ref 42
Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.

Closed-Loop Sim-to-Real Reinforcement Learning for Deformable Microfiber Shape Control
cs.RO · 2026-05-20 · unverdicted · none · ref 22
A closed-loop sim-to-real RL policy trained in a simplified frictionless simulator achieves sub-millimeter microfiber shape control on physical hardware via visual feedback without retraining.

SmoCap: Unified Scale-Pose Canonicalization with Proxy-Mapped Trust-Region QP
cs.RO · 2026-05-20 · unverdicted · none · ref 25
SmoCap performs unified scale-pose canonicalization for motion capture by solving constrained trust-region QPs with analytical proxy-mapped Jacobians in a sparse control subspace.

Automatically Improving Simulation Physics for Articulated Objects
cs.RO · 2026-05-18 · unverdicted · none · ref 14
A simulator-in-the-loop multi-modal method refines physical properties of incomplete 3D articulated objects to improve simulation stability and downstream robot policy performance.

Before the Body Moves: Learning Anticipatory Joint Intent for Language-Conditioned Humanoid Control
cs.RO · 2026-05-14 · unverdicted · none · ref 3 · 2 links
DAJI is a hierarchical framework using distillation and autoregressive generation to learn future-aware joint intents for language-conditioned humanoid robot control.

Rethinking Priority Scheduling for Sequential Multi-Agent Decision Making in Stackelberg Games
cs.MA · 2026-05-08 · unverdicted · none · ref 11
HPA dynamically selects agent decision orders in Stackelberg games to improve equilibria and performance in multi-agent MuJoCo control tasks.

Gated Memory Policy
cs.RO · 2026-04-21 · unverdicted · none · ref 48
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.

ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation
cs.RO · 2026-04-13 · unverdicted · none · ref 43
Compositional Simulation generates scalable real-world robot training data by combining classical simulation with neural simulation in a closed-loop real-sim-real augmentation pipeline.

ARROW: Augmented Replay for RObust World models
cs.LG · 2026-03-12 · unverdicted · none · ref 29
ARROW adds a distribution-matching long-term replay buffer to DreamerV3 and shows reduced forgetting versus same-size baselines on Atari and Procgen continual RL benchmarks.

From Fold to Function: Simulation-Driven Design of Origami Mechanisms
cs.RO · 2025-11-13 · conditional · none · ref 36
A simulation framework using MuJoCo deformable bodies and CMA-ES optimization enables rapid design and experimental validation of origami mechanisms like an improved catapult.

MOBIUS: A Multi-Modal Bipedal Robot that can Walk, Crawl, Climb, and Roll
cs.RO · 2025-11-03 · unverdicted · none · ref 42
MOBIUS is a multi-modal bipedal robot with hybrid reinforcement learning and force control plus an MIQCP planner that enables walking, crawling, climbing, and rolling on varied terrains.

Geometric Analysis of Neural Regression Collapse via Intrinsic Dimension
cs.LG · 2025-10-01 · unverdicted · none · ref 22
Neural regression collapse occurs when last-layer feature intrinsic dimension falls below target intrinsic dimension, creating over-compressed and under-compressed regimes that govern generalization based on data quantity and noise.

Behavior Synthesis via Contact-Aware Fisher Information Maximization
cs.RO · 2025-05-18 · unverdicted · none · ref 50
Derives a contact-aware Fisher information measure to synthesize robot behaviors that maximize information-rich contacts for efficient object parameter learning.

Gymnasium: A Standard Interface for Reinforcement Learning Environments
cs.LG · 2024-07-24 · accept · none · ref 31
Gymnasium establishes a standardized API for RL environments to improve interoperability, reproducibility, and ease of development in reinforcement learning.

Latent Linear Quadratic Regulator for Robotic Control Tasks
cs.RO · 2024-07-15 · unverdicted · none · ref 15
LaLQR learns a latent linear-quadratic representation of robotic systems by imitating MPC to enable efficient LQR control.

The embodied brain: Bridging the brain, body, and behavior with neuromechanical digital twins
q-bio.NC · 2026-01-12 · unverdicted · none · ref 10
Neuromechanical digital twins embed neural controllers in simulated bodies to infer unmeasurable biophysical variables, generate testable hypotheses via perturbations, and bridge neuroscience with robotics and machine learning.
