OVOW reconstructs instance-level, simulation-ready 4D mesh scenes from monocular video via a four-stage training-free pipeline and introduces a new benchmark for structured Video-to-4D evaluation.
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
Canonical reference. 76% of citing Pith papers cite this work as background.
abstract
Isaac Gym offers a high-performance learning platform to train policies for a wide variety of robotics tasks directly on GPU. Both physics simulation and the neural network policy training reside on GPU and communicate by directly passing data from physics buffers to PyTorch tensors without ever going through any CPU bottlenecks. This leads to blazing fast training times for complex robotics tasks on a single GPU, with 2-3 orders of magnitude improvements compared to conventional RL training that uses a CPU-based simulator and GPU for neural networks. We host the results and videos at https://sites.google.com/view/isaacgym-nvidia and Isaac Gym can be downloaded at https://developer.nvidia.com/isaac-gym.
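The buffer-sharing idea in the abstract can be illustrated with a small CPU-only analogy. The sketch below does not use the Isaac Gym or PyTorch APIs; the names `physics_buffer`, `obs_view`, and `step_physics` are hypothetical, and a standard-library `memoryview` stands in for the tensor that wraps simulator memory. It only shows the general pattern of a consumer reading simulation state in place rather than through a copy.

```python
from array import array

# Hypothetical stand-in for a simulator-owned state buffer:
# one float32 value per parallel environment.
NUM_ENVS = 4
physics_buffer = array("f", [0.0] * NUM_ENVS)

# A zero-copy view over the same memory. In this analogy it plays the
# role of the tensor that the policy reads (assumption: illustration only).
obs_view = memoryview(physics_buffer)


def step_physics(buf, dt=0.5):
    """Advance every environment in place (toy dynamics: x += dt)."""
    for i in range(len(buf)):
        buf[i] += dt


step_physics(physics_buffer)

# The view reflects the update without any explicit copy or transfer.
observations = obs_view.tolist()
print(observations)
```

Because the view and the buffer share memory, the "policy side" sees each simulation step immediately; the abstract describes the same handoff happening on GPU memory, which is what avoids a CPU round trip.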
claims ledger
- background methods in non-inertial environments before hardware deployment. They also provide scalable training environments for learning-based controllers. Compared with reduced- and full-order analytical models, they capture effects such as multi-contact, self-collision, and actuator dynamics that are often simplified or neglected. Widely used simulators, including MuJoCo [67], Isaac Gym [68], Isaac Lab [69], PyBullet [70], and RaiSim [71], fairly accurately describe multi-rigid-body dynamics and
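- background For the first term in CMSE, applying the central moment lemma from Lemma C.2: $m_2[\phi g] \le A_1 t d + A_2 (\frac{L}{\lambda})^2 t^2 d$ (140). For the second term in CMSE: $\pi(|\phi g|^{2r})^{\frac{1}{r}} \le \big(B_r t^r d^r + B_{2r} (\frac{L}{\lambda})^{2r} t^{2r} d^r\big)^{\frac{1}{r}}$ (141) $\le (B_1 t + B_2 (\frac{L}{\lambda})^2 t^2) d$ (142), and $m_{2e}[g]^{\frac{1}{e}} \le B_3 (\frac{L}{\lambda})^2 t d$ (143). For the third term in CMSE: $\pi(|\phi|^{2p})^{\frac{1}{p}} \le S_1 t d$ (144) and $m_{2q(1+\frac{1}{p})}[g]^{\frac{1}{q}} \le S_{1+\frac{1}{p}} (\frac{L}{\lambda})^{2+\frac{2}{p}} t^{1+\frac{1}{p}} d$ (145). Combining these results, we get (for the estimation of $E_{y \sim p_{0|t}}[y|x]$): $|E[\mu_N(\phi) - \mu(\phi)]| \le \frac{d}{N} \big(E_{\frac{1}{2}} t^{\frac{1}{2}} + E_1 \frac{L}{\lambda} t + E_{1+\frac{1}{2p}} (\frac{L}{\lambda})^{1+\frac{1}{p}} t$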
- background to improve robustness under uncertainty [8], [24]. At the same time, reinforcement and imitation learning have enabled increasingly capable contact-rich behaviors, including in-hand manipulation, precision assembly, and coordinated multi-arm action [25], [26], with physics-based simulation serving as a key enabler for large-scale training, benchmarking, and sim-to-real transfer [27], [28]. Overall, macroscale dexterous manipulation is characterized by high-DOF embodiments, multimodal sensing, a
- dataset Overcooked-AI [81], Human-AI Coordination & Puzzles, 2019, arXiv:1910.05789; EPyMARL [82], Grid-world Foraging, 2020, arXiv:2006.07869; Robot Warehouse (RWARE) [83], Multi-Robot Warehouse Logistics, 2020, arXiv:2006.07869; Habitat 3.0 [84], Interactive & Human-Robot Synergy, 2023, arXiv:2310.13724; MA-Gym, Cooperative Grid-world Settings, 2021, GitHub: ma-gym; VMAS [85], Vectorized 2D Physics Control, 2022, arXiv:2207.03530; Isaac Gym [86], GPU-accelerated Physics Simulation, 2021, arXiv:2108.10470. Part III: Standardized Suite
- background policies improved adaptability [26], [29], [15]. Large-scale training with curricula and parallel simulation accelerated learning and broadened generalization [16], [23], supported by sim-to-real techniques such as dynamics and domain randomization [21], [27]. Standard policy optimization backbones (e.g., PPO) remain dominant [25], often paired with high-throughput simulators [14]. Model-based and compliant control approaches further complement learned policies for stable bipedal walking [22
- dataset are such that no single robot can reach across the entire table. Therefore, both robots must collaborate to complete the task, e.g., place the object at an intermediate location reachable by the other robot, which then completes the task by placing the object at the goal. Demonstration Data. We train our method entirely in simulation by replicating our hardware setup in the high-fidelity Isaac Gym simulator [42]. We collect pick-and-place demonstrations using a scripted controller that drives t
representative citing papers
Dynamic isotropy, quantifying uniform center-of-mass acceleration capability, improves robot performance and enables omnidirectional locomotion, terrain traversal, and failure resilience in a spherical robot design.
PhysEditWorld is a new dataset of over 60 million frames from 12 UE5 cinematic scenes with synchronized multimodal signals and explicit gravity labels, built via replay to support physics-editable world models.
HARBOR is a new agentic harness framework that automates robot RL workflows end-to-end across 16 tasks in manipulation, locomotion, and dexterous control, matching or exceeding default configurations while enabling sim-to-real transfer.
MPPI is re-derived as EM on a probabilistic optimal control problem, producing a generalized EM-MPPI algorithm with convergence analysis for exponential families and explicit Gaussian cases.
CPPO is an on-policy contrastive RL method that derives advantages from contrastive Q-values for PPO optimization, outperforming prior CRL baselines in 14/18 tasks and matching or exceeding reward-based PPO in 12/18 tasks.
CoDi decomposes the multi-agent diffusion score into pre-trained single-agent policies plus a gradient-free cost guidance term to generate coordinated behavior from single-agent data alone.
A two-stage framework augments HOI data with dynamic priors and blends pre-trained dynamic motion and static interaction agents via a composer network to enable long-term dynamic human-object interactions with higher success rates and reduced training time.
HiPAN enables quadruped robots to navigate unstructured 3D environments more successfully by combining a high-level posture-adaptive policy with a low-level controller and curriculum learning on depth images.
HANDFUL learns resource-aware grasps using finger contact rewards and curriculum learning to improve success on sequential dexterous tasks in simulation and on a real LEAP hand.
Foot-mounted proximity sensors provide pre-contact feedback that, when integrated into RL, improves quadruped traversal robustness on discrete terrain with reliable sim-to-real transfer.
ICMPG combines LLM-based candidate generation with MPC-style physical simulation and semantic scoring to produce text-driven human motions that are both plausible and faithful.
TaskNPoint lets humanoid robots learn dynamic skills such as tennis backhands from single short human video demonstrations plus under one hour of single-GPU simulation training, achieving zero-shot generalization to new goal locations without per-task reward tuning.
Constrained RL with an explicit power budget reduces thruster power by 14-65% versus baselines across 12 simulated vehicle-task settings while preserving task performance in most cases.
TurboMPC delivers a JAX-CUDA MPC solver achieving up to 58x speedup over prior GPU solvers and scaling to 8000+ knot points on a full-scale car.
AnnotateAnything converts passive 3D assets into manipulation-ready assets by combining vision-language reasoning for semantics with parallel physics pipelines for executable action annotations such as grasps and articulations.
KPGrasp is a scalable Transformer flow-matching model using 3D hand keypoints that achieves 76.3% success on Dexonomy (47.4% improvement) and best average on DexGrasp Anything without contact losses or test-time refinement.
Video2Sim2Real turns a single human video into a deployable robot manipulation skill by reconstructing a digital twin, anchoring motions to object-centric simulator configurations, and bridging sim-to-real gaps with imitation learning and residual RL.
SIMPLE is a new large-scale simulation benchmark for humanoid loco-manipulation that integrates accurate dynamics and photorealistic rendering and demonstrates policy transfer from simulation to physical robots.
Proposes GPS representation for articulated parts, uses VR to annotate 41K frames across 234 objects, trains an RGB-D model, and achieves 73% success in heuristic manipulation policies on 9 objects.
EgoAERO reconstructs contact-consistent hand-object trajectories from single egocentric RGB-D videos without object assets via asset-free tracking and adaptive optimization, then trains robot policies with two-stage residual learning, achieving performance close to CAD-based methods.
GARDEN uses gravity alignment and conditional 3D point classification to factorize RGB reconstructions into explicit rigid bodies plus decoupled background for direct physics simulation.
EqGINO adds a spectral isotropy prior to FNOs to guarantee discrete equivariance and enable generalization to continuous SE(3) transformations on 3D PDEs with limited training data.
Any-ttach shows that rapid end-effector swapping combined with demonstration collection and task planning enables reliable multi-tool skills in long-horizon tasks such as sandwich making.