MuJoCo: A physics engine for model-based control
Mixed citation behavior. Most common role is background (67%).
citing papers explorer
- SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects
SceneCode compiles natural language prompts into executable code programs that generate editable, articulated indoor scenes for physics simulation.
- Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
- CelloCut: Constructive Watertight Remeshing via Tetrahedral Cell Cuts
CelloCut formulates watertight remeshing as binary labeling on a Delaunay tetrahedral partition solved by graph-cut minimization with one-sided constraints to guarantee volumetrically consistent solids.
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory.
- EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates
EgoFun3D creates a new task, 271-video dataset, and pipeline using function templates to model interactive 3D objects from egocentric videos for simulation.
- HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.
- Continuum Robot Localization using Distributed Time-of-Flight Sensors
Distributed low-resolution time-of-flight sensors along a 53 cm continuum robot, fused with a shape prior, achieve 2.5 cm position and 7.2 degree orientation localization error in simulation and real experiments across multiple environments.
- GLUE: Coordinating Pre-Trained Generative Models for System-Level Design
GLUE orchestrates frozen pre-trained generative models into a system-level design generator that enforces feasibility, performance, and diversity, with data-driven and data-free variants benchmarked on UAV design.
- Frictional Q-Learning
Frictional Q-Learning encodes supported actions as tangent directions on an action manifold using a contrastive variational autoencoder to reduce extrapolation errors in off-policy reinforcement learning.
- BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.
- LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines
LLMPhy uses iterative LLM-generated programs executed in physics engines to solve continuous parameter estimation and discrete scene layout problems, outperforming prior black-box methods on three new zero-shot physical reasoning datasets.
- Curriculum reinforcement learning with measurable task representation learning
A VAE-based latent task representation enables automatic curriculum generation in CRL for non-Euclidean navigation tasks, outperforming interpolation and GAN-based methods in experiments.
- ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
- Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
- PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
PhyMotion scores generated human videos by grounding recovered 3D poses in a physics simulator across kinematic, contact, and dynamic axes, yielding stronger human correlation and larger RL post-training gains than prior 2D rewards.
- R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning
R2R2 introduces a non-centered regularization objective for SPL that addresses conflicts with spectral properties, leading to better performance on continuous control tasks at high UTD ratios.
- Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?
Stronger VLM agents use mirror reflections for self-identification in controlled 3D tests, while weaker ones inspect but fail to extract or correctly attribute self-relevant information.
- Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.
- VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation
VADF adds an Adaptive Loss Network for hard-negative training sampling and a Hierarchical Vision Task Segmenter for adaptive noise scheduling during inference to speed convergence and reduce timeouts in diffusion robotic policies.
- Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
Physics simulators generate synthetic QA data for RL training that improves LLM performance on IPhO problems by 5-10 percentage points.
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
- frax: Fast Robot Kinematics and Dynamics in JAX
frax is a new open-source JAX library delivering low-microsecond CPU dynamics and over 100 million GPU evaluations per second for robot kinematics and dynamics with autodiff support.
- HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control
HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.
- Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion
MoE-based locomotion policy with RoboGauge metrics achieves reliable sim-to-real transfer, enabling robust quadrupedal walking on challenging unseen terrains up to 4 m/s.
- Neural CDEs as Correctors for Learned Time Series Models
Neural CDEs serve as correctors that reduce error accumulation in multi-step forecasts from learned time-series models across synthetic, physics, and real-world data.
- SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control
Scaling motion tracking models along size, data volume, and compute produces a foundation model for natural, robust humanoid whole-body control with downstream uses in kinematic planning and vision-language-action models.
- Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
Isaac Lab is a unified GPU-native platform combining high-fidelity physics, photorealistic rendering, multi-frequency sensors, domain randomization, and learning pipelines for scalable multi-modal robot policy training.
- GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
GraspVLA shows that pretraining a grasping model on a billion synthetic action frames enables zero-shot open-vocabulary performance and sim-to-real transfer.
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC combines offline data with online RL via advantage-weighted actor-critic updates to enable faster acquisition of robotic skills such as dexterous manipulation.
- Evolvability ES: Scalable and Direct Optimization of Evolvability
Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.
- Closed-Loop Sim-to-Real Reinforcement Learning for Deformable Microfiber Shape Control
A closed-loop sim-to-real RL policy trained in a simplified frictionless simulator achieves sub-millimeter microfiber shape control on physical hardware via visual feedback without retraining.
- SmoCap: Unified Scale-Pose Canonicalization with Proxy-Mapped Trust-Region QP
SmoCap performs unified scale-pose canonicalization for motion capture by solving constrained trust-region QPs with analytical proxy-mapped Jacobians in a sparse control subspace.
- Automatically Improving Simulation Physics for Articulated Objects
A simulator-in-the-loop multi-modal method refines physical properties of incomplete 3D articulated objects to improve simulation stability and downstream robot policy performance.
- Before the Body Moves: Learning Anticipatory Joint Intent for Language-Conditioned Humanoid Control
DAJI is a hierarchical framework using distillation and autoregressive generation to learn future-aware joint intents for language-conditioned humanoid robot control.
- Rethinking Priority Scheduling for Sequential Multi-Agent Decision Making in Stackelberg Games
HPA dynamically selects agent decision orders in Stackelberg games to improve equilibria and performance in multi-agent MuJoCo control tasks.
- Gated Memory Policy
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.
- ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation
Compositional Simulation generates scalable real-world robot training data by combining classical simulation with neural simulation in a closed-loop real-sim-real augmentation pipeline.
- ARROW: Augmented Replay for RObust World models
ARROW adds a distribution-matching long-term replay buffer to DreamerV3 and shows reduced forgetting versus same-size baselines on Atari and Procgen continual RL benchmarks.
- From Fold to Function: Simulation-Driven Design of Origami Mechanisms
A simulation framework using MuJoCo deformable bodies and CMA-ES optimization enables rapid design and experimental validation of origami mechanisms like an improved catapult.
- MOBIUS: A Multi-Modal Bipedal Robot that can Walk, Crawl, Climb, and Roll
MOBIUS is a multi-modal bipedal robot with hybrid reinforcement learning and force control plus an MIQCP planner that enables walking, crawling, climbing, and rolling on varied terrains.
- Geometric Analysis of Neural Regression Collapse via Intrinsic Dimension
Neural regression collapse occurs when last-layer feature intrinsic dimension falls below target intrinsic dimension, creating over-compressed and under-compressed regimes that govern generalization based on data quantity and noise.
- Behavior Synthesis via Contact-Aware Fisher Information Maximization
Derives a contact-aware Fisher information measure to synthesize robot behaviors that maximize information-rich contacts for efficient object parameter learning.
- Gymnasium: A Standard Interface for Reinforcement Learning Environments
Gymnasium establishes a standardized API for RL environments to improve interoperability, reproducibility, and ease of development in reinforcement learning.
- Latent Linear Quadratic Regulator for Robotic Control Tasks
LaLQR learns a latent linear-quadratic representation of robotic systems by imitating MPC to enable efficient LQR control.
- The embodied brain: Bridging the brain, body, and behavior with neuromechanical digital twins
Neuromechanical digital twins embed neural controllers in simulated bodies to infer unmeasurable biophysical variables, generate testable hypotheses via perturbations, and bridge neuroscience with robotics and machine learning.