JoyAI-Sim: A Simulation-Enabled Interconversion Toolchain for the Embodied Data Pyramid
Pith reviewed 2026-06-30 10:28 UTC · model grok-4.3
The pith
JoyAI-Sim provides bidirectional pathways that convert real robot tasks into simulations for human evaluation and lift human demonstrations into robot trajectories while enforcing physical constraints.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
JoyAI-Sim establishes two complementary pathways denoted Robot ⇌ Simulation ⇌ Human. The Robot → Simulation → Human direction reconstructs real-robot tabletop organization tasks as calibrated digital twins for scalable evaluation and applies human embodied feedback to refine simulated motion naturalness. The Human → Simulation → Robot direction lifts ego-centric human demonstrations into simulation, verifies them against robot physical constraints, and converts them into robot-centered trajectories, annotations, and visual observations. The JoySim simulator thereby serves simultaneously as a scalable evaluation layer and a physical consistency filter for robot data generation, with the recon
What carries the argument
The bidirectional Robot ⇌ Simulation ⇌ Human pathways with JoySim acting as both scalable evaluation layer and physical consistency filter.
If this is right
- Real-robot trials can be evaluated at larger scale without repeated physical execution.
- Human demonstrations become convertible into trajectories that already satisfy robot physical limits.
- Model evaluation gains an explicit human-robot alignment step through embodied feedback on simulated motions.
- Data generation pipelines gain an early filter that discards physically inconsistent motions before robot deployment.
- Core modules become reusable cloud infrastructure rather than one-off local setups.
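The "early filter" idea in the list above can be made concrete with a small sketch. The paper does not specify how JoySim checks physical consistency, so everything here is an assumption for illustration: the `JointLimits` type, the function names, and the choice of joint-position and joint-velocity limits as the only constraints are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

Trajectory = List[Sequence[float]]  # one joint-position vector per time step


@dataclass
class JointLimits:
    """Hypothetical per-joint limits; not taken from the paper."""
    lower: Sequence[float]         # minimum joint position (rad)
    upper: Sequence[float]         # maximum joint position (rad)
    max_velocity: Sequence[float]  # maximum absolute joint velocity (rad/s)


def is_physically_consistent(traj: Trajectory, limits: JointLimits, dt: float) -> bool:
    """Return True if every waypoint and every step respects the limits."""
    # Position check: each joint must stay inside [lower, upper].
    for q in traj:
        for qi, lo, hi in zip(q, limits.lower, limits.upper):
            if not (lo <= qi <= hi):
                return False
    # Velocity check: finite-difference speed between consecutive waypoints.
    for q_prev, q_next in zip(traj, traj[1:]):
        for a, b, vmax in zip(q_prev, q_next, limits.max_velocity):
            if abs(b - a) / dt > vmax:
                return False
    return True


def filter_trajectories(trajs: List[Trajectory], limits: JointLimits, dt: float) -> List[Trajectory]:
    """Keep only the trajectories that pass the consistency check."""
    return [t for t in trajs if is_physically_consistent(t, limits, dt)]
```

A real filter would presumably also cover collisions, torque limits, and contact dynamics; this sketch only shows the shape of a reject-before-deployment step.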
Where Pith is reading between the lines
- The same interconversion pattern could reduce the fraction of development time spent on physical hardware across a wider range of manipulation tasks.
- If the digital-twin calibration step generalizes, similar toolchains might support sim-to-real transfer testing for non-tabletop environments.
- Cloud packaging of the modules opens the possibility of shared community datasets generated under consistent physical filters.
- The realism-augmentation modules might be tested independently to measure their isolated effect on downstream policy performance.
Load-bearing premise
Real-robot tabletop tasks can be accurately rebuilt as calibrated digital twins in simulation and human embodied feedback can reliably inspect and refine the naturalness of simulated motions.
What would settle it
A side-by-side test in which policies trained or evaluated exclusively through the toolchain show measurably lower success rates, or more failure modes, when deployed on the original physical robots than policies trained only on real-robot data.
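One way such a side-by-side comparison could be scored is a two-proportion z-test on real-robot success counts. This is a minimal sketch, not something the paper proposes: the function name is hypothetical, the test assumes independent trials, and the normal approximation needs enough trials in each arm to be reasonable.

```python
import math
from typing import Tuple


def two_proportion_z(successes_a: int, trials_a: int,
                     successes_b: int, trials_b: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two success rates.

    Arm A might be policies produced through the toolchain and arm B
    policies trained only on real-robot data (an illustrative assumption).
    Returns (z statistic, two-sided p-value).
    """
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    # Pooled success rate under the null hypothesis of equal rates.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / trials_a + 1.0 / trials_b))
    if se == 0.0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A negative `z` with a small p-value would indicate that arm A succeeds less often than arm B on the physical robots, which is the outcome that would count against the toolchain.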
Original abstract
Generalist robot policies require trustworthy evaluation and robot-usable training data, but both are difficult to scale with physical robots alone. Real-robot trials and demonstrations remain the most faithful source of deployment signals, yet they are slow, costly, and hard to reproduce. We present JoyAI-Sim, a simulation-enabled interconversion toolchain for human-robot aligned model evaluation and data generation, denoted as Robot $\rightleftharpoons$ Simulation $\rightleftharpoons$ Human. On the one hand, the Robot $\rightarrow$ Simulation $\rightarrow$ Human pathway supports human-robot aligned model evaluation by reconstructing real-robot tabletop organization tasks as calibrated digital twins for scalable evaluation, while using human embodied feedback to inspect and refine the naturalness of simulated motions. On the other hand, the Human $\rightarrow$ Simulation $\rightarrow$ Robot pathway supports human-robot aligned data generation: it lifts ego-centric human demonstrations into simulation, checks them under robot physical constraints, and converts them into robot-centered trajectories, annotations, and visual observations. Together, these pathways use the JoySim simulator as both a scalable evaluation layer and a physical consistency filter for robot data generation. We further package the core reconstruction, simulation, rendering, and realism-augmentation modules as cloud services on JD Cloud, turning the system into reusable infrastructure for robot data generation and model evaluation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents JoyAI-Sim, a simulation-enabled interconversion toolchain for human-robot aligned model evaluation and data generation, denoted as Robot ⇌ Simulation ⇌ Human. It describes two pathways: Robot → Simulation → Human, which reconstructs real-robot tabletop tasks as calibrated digital twins for scalable evaluation and uses human embodied feedback to refine simulated motion naturalness; and Human → Simulation → Robot, which lifts ego-centric human demonstrations into simulation, enforces robot physical constraints, and converts them to robot-centered trajectories and annotations. The JoySim simulator is positioned as both an evaluation layer and physical consistency filter, with core modules packaged as JD Cloud services for reusable infrastructure.
Significance. If the described reconstruction, feedback, and conversion mechanisms can be validated, the toolchain could provide meaningful infrastructure for scaling trustworthy evaluation and data generation in generalist robot policies, addressing the cost and reproducibility limits of physical-robot-only approaches while enabling human-robot alignment.
major comments (3)
- [Abstract] Abstract: The claims that the pathways deliver 'human-robot aligned model evaluation' and that JoySim serves as a 'scalable evaluation layer and a physical consistency filter' are load-bearing for the central contribution, yet the manuscript provides no experimental results, validation metrics (e.g., reconstruction pose/trajectory error, physics parameter fidelity), error analysis, or baseline comparisons to support effectiveness or alignment.
- [Robot → Simulation → Human pathway] Robot → Simulation → Human pathway description: The assumption that real-robot tabletop organization tasks can be accurately reconstructed as calibrated digital twins and that human embodied feedback reliably inspects/refines motion naturalness is presented without any reported quantitative metrics on reconstruction fidelity or inter-rater reliability of naturalness judgments; if either fails, the evaluation pathway cannot deliver the claimed trustworthiness.
- [Human → Simulation → Robot pathway] Human → Simulation → Robot pathway description: No details or results are given on the accuracy of physical-constraint checking, the fidelity of converted robot-centered trajectories, or how annotations/visual observations are generated, which is required to substantiate the data-generation claims.
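The reconstruction pose/trajectory error that the major comments ask for could be reported with metrics as simple as the following. This is a sketch under stated assumptions: the two trajectories are already time-aligned and expressed in the same frame, positions are in metres, and the function names are hypothetical rather than part of JoyAI-Sim.

```python
import math
from typing import Sequence

Point = Sequence[float]  # (x, y, z) position in metres


def position_rmse(real: Sequence[Point], sim: Sequence[Point]) -> float:
    """Root-mean-square Euclidean distance between paired waypoints."""
    if len(real) != len(sim) or not real:
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [sum((r - s) ** 2 for r, s in zip(p, q)) for p, q in zip(real, sim)]
    return math.sqrt(sum(squared) / len(squared))


def max_position_error(real: Sequence[Point], sim: Sequence[Point]) -> float:
    """Largest Euclidean distance between any pair of matched waypoints."""
    if len(real) != len(sim) or not real:
        raise ValueError("trajectories must be non-empty and equally long")
    return max(math.dist(p, q) for p, q in zip(real, sim))
```

Reporting both values is a deliberate choice: RMSE summarizes average fidelity, while the maximum error exposes a single bad waypoint that an average would hide. Orientation error would need a separate rotation metric, which this sketch omits.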
minor comments (2)
- The bidirectional notation (Robot ⇌ Simulation ⇌ Human) is introduced in the abstract but would benefit from an accompanying diagram or explicit definition of the interconversion steps in the main text for clarity.
- The manuscript refers to 'core reconstruction, simulation, rendering, and realism-augmentation modules' without specifying their implementation details, input/output formats, or open-source availability, which would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments highlighting the need for empirical grounding of the toolchain claims. The manuscript is a system description of JoyAI-Sim and its cloud services; we address each point by clarifying scope and proposing targeted revisions.
Point-by-point responses
- Referee: [Abstract] Abstract: The claims that the pathways deliver 'human-robot aligned model evaluation' and that JoySim serves as a 'scalable evaluation layer and a physical consistency filter' are load-bearing for the central contribution, yet the manuscript provides no experimental results, validation metrics (e.g., reconstruction pose/trajectory error, physics parameter fidelity), error analysis, or baseline comparisons to support effectiveness or alignment.
  Authors: We agree the claims are load-bearing and currently unsupported by metrics. The paper presents the architectural design and intended use of the pathways rather than validated performance. In revision we will rephrase the abstract to indicate design intent (e.g., 'is designed to support' instead of 'supports') and add an explicit Future Work section outlining planned quantitative validation, including the suggested metrics on reconstruction error and physics fidelity.
  Revision: yes
- Referee: [Robot → Simulation → Human pathway] Robot → Simulation → Human pathway description: The assumption that real-robot tabletop organization tasks can be accurately reconstructed as calibrated digital twins and that human embodied feedback reliably inspects/refines motion naturalness is presented without any reported quantitative metrics on reconstruction fidelity or inter-rater reliability of naturalness judgments; if either fails, the evaluation pathway cannot deliver the claimed trustworthiness.
  Authors: We acknowledge the absence of quantitative metrics on reconstruction fidelity and naturalness judgment reliability. The current text describes the system modules and workflow. We will add a Limitations subsection discussing these assumptions and the conditions under which the pathway may not achieve the intended trustworthiness. Because no such experiments were performed for this submission, specific numerical results cannot be inserted.
  Revision: partial
- Referee: [Human → Simulation → Robot pathway] Human → Simulation → Robot pathway description: No details or results are given on the accuracy of physical-constraint checking, the fidelity of converted robot-centered trajectories, or how annotations/visual observations are generated, which is required to substantiate the data-generation claims.
  Authors: We agree that implementation-level details and accuracy results for constraint checking, trajectory fidelity, and annotation generation are not provided. We will expand the relevant section with additional pseudocode and module descriptions for how physical constraints are enforced and how robot-centered outputs are produced. Accuracy metrics remain unavailable without new experiments and will be noted as future work.
  Revision: partial
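The inter-rater reliability of naturalness judgments raised in the second exchange has a standard two-rater measure, Cohen's kappa. The sketch below is illustrative only: the labels and the `cohens_kappa` helper are assumptions, and neither the paper nor the rebuttal states which reliability statistic would be used.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)
```

A kappa near 1 would suggest that "naturalness" is being judged consistently across raters, while a value near 0 would mean agreement is no better than chance. With more than two raters, a statistic such as Fleiss' kappa would be the usual substitute.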
Circularity Check
No circularity: purely architectural description of toolchain with no derivations, fits, or predictions
full rationale
The paper presents JoyAI-Sim as a simulation toolchain with two pathways (Robot→Simulation→Human and Human→Simulation→Robot) that use JoySim for evaluation and data generation. No equations, parameters, predictions, or derivations appear in the abstract or described content. The description is infrastructural and does not reduce any claim to a self-citation, fit, or definitional loop. This matches the default expectation of no circularity for non-mathematical system papers.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Real-robot tasks can be accurately reconstructed as calibrated digital twins, and human feedback can refine simulated motions for naturalness.