Solving Rubik's Cube with a Robot Hand

Canonical reference. 78% of citing Pith papers cite this work as background.

47 Pith papers citing it

abstract

We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain randomization (ADR), and a robot platform built for machine learning. ADR automatically generates a distribution over randomized environments of ever-increasing difficulty. Control policies and vision state estimators trained with ADR exhibit vastly improved sim2real transfer. For control policies, memory-augmented models trained on an ADR-generated distribution of environments show clear signs of emergent meta-learning at test time. The combination of ADR with our custom robot platform allows us to solve a Rubik's cube with a humanoid robot hand, which involves both control and state estimation problems. Videos summarizing our results are available: https://openai.com/blog/solving-rubiks-cube/
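
The abstract describes ADR only at a high level: a distribution over randomized environments that grows harder over time. The sketch below is one simplified way to read that idea, not the authors' implementation. Each simulator parameter has a sampling range. The policy is probed at a range boundary, and that boundary is pushed outward when performance there is good and pulled back when it is poor. `ParamRange`, `adr_step`, the `evaluate` callback, the buffer size, and the thresholds are all hypothetical names and values chosen for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ParamRange:
    """Sampling range [low, high] for one randomized simulator parameter (hypothetical)."""
    low: float
    high: float
    step: float = 0.1  # assumed amount by which a boundary moves
    low_buf: list = field(default_factory=list)   # scores seen at the low boundary
    high_buf: list = field(default_factory=list)  # scores seen at the high boundary


def sample_env(ranges, rng):
    """Draw one environment by sampling every parameter uniformly from its range."""
    return {name: rng.uniform(r.low, r.high) for name, r in ranges.items()}


def adr_step(ranges, evaluate, rng, buffer_size=5, t_low=0.2, t_high=0.8):
    """Probe one boundary of one parameter and adjust it based on mean performance.

    `evaluate(env)` is an assumed callback returning a score in [0, 1].
    """
    name = rng.choice(sorted(ranges))
    r = ranges[name]
    probe_high = rng.random() < 0.5

    # Pin the chosen parameter to its boundary; sample the rest normally.
    env = sample_env(ranges, rng)
    env[name] = r.high if probe_high else r.low

    buf = r.high_buf if probe_high else r.low_buf
    buf.append(evaluate(env))
    if len(buf) < buffer_size:
        return

    mean = sum(buf) / len(buf)
    buf.clear()
    if mean >= t_high:
        # The policy copes at this boundary: widen the range (harder distribution).
        if probe_high:
            r.high += r.step
        else:
            r.low -= r.step
    elif mean <= t_low:
        # The policy struggles: narrow the range, but never invert it.
        if probe_high:
            r.high = max(r.low, r.high - r.step)
        else:
            r.low = min(r.high, r.low + r.step)
```

In a training loop, `sample_env` would supply environments for policy updates while `adr_step` runs alongside it, so the range widens only as fast as the policy improves. Starting each range at a single point (`low == high`) corresponds to beginning with no randomization at all.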

citation-role summary: background 8, baseline 1

representative citing papers

Betting for Sim-to-Real Performance Evaluation

cs.RO · 2026-04-27 · unverdicted · novelty 7.0

Betting mechanisms can yield provably more accurate and efficient estimates of real-world robot behavior than Monte Carlo sampling under specified conditions, with practical approximations demonstrated on synthetic data and a robotic manipulator task.

SynthPID: P&ID digitization from Topology-Preserving Synthetic Data

cs.CV · 2026-04-15 · conditional · novelty 7.0

Topology-preserving synthetic P&IDs generated by seeding from real drawings enable models trained solely on synthetics to achieve 63.8% edge mAP on real P&ID benchmarks, closing most of the gap to real-data training.

Learning to Play Piano in the Real World

cs.RO · 2025-03-19 · unverdicted · novelty 7.0

A Sim2Real2Sim learning pipeline enables a real-world dexterous robot to play piano pieces including Happy Birthday and Ode to Joy with an average F1-score of 0.881.

Mind the Sim-to-Real Gap & Think Like a Scientist

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

The paper decomposes simulator value errors into identifiable shifts and irreducible residuals, shows passive learning fails on reachability, and introduces Fisher-SEP to minimize posterior value variance via targeted experiments.

Trajectory-based actuator identification via differentiable simulation

cs.RO · 2026-04-11 · unverdicted · novelty 6.0

Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.

Learning Dexterous Grasping from Sparse Taxonomy Guidance

cs.RO · 2026-04-05 · unverdicted · novelty 6.0

GRIT learns dexterous grasping from sparse taxonomy guidance, achieving 87.9% success and better generalization to novel objects via a two-stage prediction-plus-policy approach.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution · cs.CL · 2023-09-28 · unverdicted · none · ref 246 · internal anchor

    Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.

  • Language Models (Mostly) Know What They Know · cs.CL · 2022-07-11 · unverdicted · none · ref 92 · internal anchor

    Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

  • A General Language Assistant as a Laboratory for Alignment · cs.CL · 2021-12-01 · conditional · none · ref 37 · internal anchor

    Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.