{"work":{"id":"6fe159e0-fa73-481a-88d4-4719c15140be","openalex_id":null,"doi":null,"arxiv_id":"2304.13705","raw_key":null,"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","authors":null,"authors_text":"Tony Z. Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn","year":2023,"venue":"cs.RO","abstract":"Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleoperation interface. Imitation learning, however, presents its own challenges, particularly in high-precision domains: errors in the policy can compound over time, and human demonstrations can be non-stationary. To address these challenges, we develop a simple yet novel algorithm, Action Chunking with Transformers (ACT), which learns a generative model over action sequences. ACT allows the robot to learn 6 difficult tasks in the real world, such as opening a translucent condiment cup and slotting a battery with 80-90% success, with only 10 minutes worth of demonstrations. 
Project website: https://tonyzhaozh.github.io/aloha/","external_url":"https://arxiv.org/abs/2304.13705","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-06-29T08:43:14.990606+00:00","pith_arxiv_id":"2304.13705","created_at":"2026-05-09T05:50:25.652943+00:00","updated_at":"2026-06-29T08:43:14.990606+00:00","title_quality_ok":true,"display_title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","render_title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"},"hub":{"state":{"work_id":"6fe159e0-fa73-481a-88d4-4719c15140be","tier":"super_hub","tier_reason":"100+ Pith inbound or 10,000+ external citations","pith_inbound_count":166,"external_cited_by_count":null,"distinct_field_count":4,"first_pith_cited_at":"2023-10-16T17:57:23+00:00","last_pith_cited_at":"2026-06-26T09:13:16+00:00","author_build_status":"needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-29T09:08:32.549879+00:00","tier_text":"super_hub"},"tier":"super_hub","role_counts":[{"context_role":"background","n":36},{"context_role":"method","n":8},{"context_role":"baseline","n":4},{"context_role":"dataset","n":1},{"context_role":"other","n":1}],"polarity_counts":[{"context_polarity":"background","n":36},{"context_polarity":"use_method","n":7},{"context_polarity":"baseline","n":4},{"context_polarity":"unclear","n":2},{"context_polarity":"use_dataset","n":1}],"runs":{"ask_index":{"job_type":"ask_index","status":"succeeded","result":{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","claims":[{"claim_text":"Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. 
Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleop","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"[68] Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, and Marco Tagliasacchi. Soundstream: An end-to-end neural audio codec, 2021. URL https://arxiv. org/abs/2107.03312. [69] Tony Z Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705 , 2023. [70] Tony Z Zhao, Jonathan Tompson, Danny Driess, Pete Florence, Kamyar Ghasemipour, Chelsea Finn, and Ayzaan Wahid. Aloha unleashed: A simple rec","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"Pt =F p(Condp(ϕp(At−1), F vis t )) (1) whereϕ(·)projects the past trajectory into feature space, and Cond(·)is a feature fusion operator which degrades to outputFvis t directly when there is no past estimate (t= 0). 3.3 Instantiations To demonstrate the versatility of X-Imitator, we instantiate the action branch using three representative visuomotor policies: DP3 [72], ACT [75], and RISE [59]. For the pose branch, we implement it as a lightweight diffusion head [12,27] for simplicity. While thes","claim_type":"method","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"Our project page is at https://3dgen4robot.github.io. Index Terms-3D generation, embodied AI, robotic simulation, scene generation, sim-to-real transfer ✦ 1 INTRODUCTION E MBODIEDAI and robotic systems are increasingly ex- pected to perceive, reason, and act in open-ended phys- ical environments [1], [2]. 
Recent progress in large-scale policy learning [3], [4], vision-language-action models [5]- [9], and high-fidelity simulation [10]-[12] has significantly expanded what these systems can do. How","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"shared robot learning harness: a substrate of typed contracts, chambered execution, and uniform transport, plus a content layer of Guides, Sensors, and State. This changes the unit of work from pairwise integration to reusable onboarding, reducing the burden toΘ(N+M+K). (W AM)), benchmark suitesB (e.g., LIBERO [9], RoboCasa [10], ManiSkill [11], ALOHA [12], etc.), and robot embodiments R (e.g., single-arm, bimanual, dexterous-hand, locomotion, humanoid, etc.), with cardinalities N, M, and K, res","claim_type":"dataset","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"its \"Alternating Condition Injection\" scheme [27], but shows a limitation in handling closed-loop feedback. As visualized in Figure 6, it often fails to correct mistakes in the \"scoop X into *We use this FiLM-EfficientNet implementation only for language- dependent tasks (\"scoop X into bowl\" and \"put X into pot\"). For clothes folding tasks, we use the original ResNet-18 [10] backbone as in [57]. Fig. 5: ALOHA language following results. Success rates in approaching language-specified target obje","claim_type":"method","confidence":0.85,"evidence_strength":"citation_context"},{"claim_text":"In Conference on Robot Learning, pages 1723-1736. PMLR, 2023. [17] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional generative models. Advances in neural information processing systems , 28, 2015. [18] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013. [19] Tony Z Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. Learning fine-grained bimanual manipulation with low-cost hardware. 
","claim_type":"method","confidence":0.8,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware because it crossed a citation-hub threshold. Current citing contexts most often use it as background evidence (7 contexts).","role_counts":[{"n":7,"context_role":"background"},{"n":4,"context_role":"method"},{"n":1,"context_role":"dataset"}]},"error":null,"updated_at":"2026-05-15T20:47:59.915196+00:00"},"author_expand":{"job_type":"author_expand","status":"succeeded","result":{"authors_linked":[{"id":"374beb60-b37d-4aba-b168-380d605e39cd","orcid":null,"display_name":"Tony Z. Zhao"},{"id":"217e0c5c-bc77-488e-a9e0-4f47d7c8ea60","orcid":null,"display_name":"Vikash Kumar"},{"id":"ee56e6c3-f424-4a4b-852d-8ab2deb6dc65","orcid":null,"display_name":"Sergey Levine"},{"id":"379e406e-0cbc-4ede-b9dd-9a76a16a6da8","orcid":null,"display_name":"Chelsea Finn"}]},"error":null,"updated_at":"2026-05-15T20:48:00.098983+00:00"},"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T06:37:31.930066+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"$\\pi_0$: A Vision-Language-Action Flow Model for General Robot Control","work_id":"f790abdc-a796-482f-a40d-f8ee035ecfc2","shared_citers":48},{"title":"OpenVLA: An Open-Source Vision-Language-Action Model","work_id":"3e7e65c5-5aed-4fe9-8414-2092bcb31cc7","shared_citers":44},{"title":"$\\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization","work_id":"d1ad7304-d09a-49bc-809e-846439f6aff9","shared_citers":34},{"title":"RT-1: Robotics Transformer for Real-World Control at Scale","work_id":"e11bda85-8531-46bc-a07f-d0ade3643ab1","shared_citers":29},{"title":"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control","work_id":"ff438a8a-8003-4fae-9131-acd418b3597b","shared_citers":23},{"title":"GR00T 
N1: An Open Foundation Model for Generalist Humanoid Robots","work_id":"e2db69c7-ee8a-4cb7-a761-7b8de1dfcf97","shared_citers":22},{"title":"Octo: An Open-Source Generalist Robot Policy","work_id":"f9ca0722-8855-48c3-a27a-0eefb7e19253","shared_citers":22},{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","work_id":"62f0fb6c-e6ae-4dc4-95a4-d9dd64b240e8","shared_citers":18},{"title":"Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success","work_id":"04f46bb3-4346-47e8-bf09-c75d91f96e87","shared_citers":17},{"title":"RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation","work_id":"9b985126-4a2f-4bdf-b014-2a7524ec634e","shared_citers":17},{"title":"DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset","work_id":"13253de2-3d89-415c-8c2f-3adb25d4c337","shared_citers":15},{"title":"FAST: Efficient Action Tokenization for Vision-Language-Action Models","work_id":"83a8f966-6cfa-4f21-81f3-87440aae238f","shared_citers":15},{"title":"SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics","work_id":"0c5e9314-5fa7-4613-ad12-605a71d561d2","shared_citers":15},{"title":"RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation","work_id":"12319725-bc7d-4c32-a229-ad270a7460bc","shared_citers":14},{"title":"Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations","work_id":"62dbe235-8473-4190-8686-17e7437de50f","shared_citers":13},{"title":"3D-VLA: A 3D Vision-Language-Action Generative World Model","work_id":"aebf924c-e761-437e-9cee-f1ccc2e427bd","shared_citers":11},{"title":"SAM 2: Segment Anything in Images and Videos","work_id":"acc13f66-d814-44f9-9688-375688bf2d4a","shared_citers":11},{"title":"What Matters in Learning from Offline Human Demonstrations for Robot Manipulation","work_id":"6a4c95c5-540e-4854-946d-c7c8a6c540ba","shared_citers":11},{"title":"arXiv preprint arXiv:2403.03954 
(2024)","work_id":"bded01e1-c070-4537-a75a-ace4c75d0c95","shared_citers":10},{"title":"Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets","work_id":"59e728c0-b6ca-4759-a8f4-02b981f2220f","shared_citers":10},{"title":"Do As I Can, Not As I Say: Grounding Language in Robotic Affordances","work_id":"037320f1-b0a9-4cbe-a639-bfb25409ce71","shared_citers":10},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":10},{"title":"LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning","work_id":"662203ad-084f-42c4-8e60-977b3173755b","shared_citers":10},{"title":"Mobile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation","work_id":"5f6ff8ef-ed80-4c00-92c2-361c80bf8448","shared_citers":10}],"time_series":[{"n":6,"year":2024},{"n":6,"year":2025},{"n":71,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T06:37:27.911502+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T06:37:40.440537+00:00"},"role_polarity":{"job_type":"role_polarity","status":"succeeded","result":{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","claims":[{"claim_text":"Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. 
Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleop","claim_type":"abstract","evidence_strength":"source_metadata"},{"claim_text":"[68] Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, and Marco Tagliasacchi. Soundstream: An end-to-end neural audio codec, 2021. URL https://arxiv. org/abs/2107.03312. [69] Tony Z Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705 , 2023. [70] Tony Z Zhao, Jonathan Tompson, Danny Driess, Pete Florence, Kamyar Ghasemipour, Chelsea Finn, and Ayzaan Wahid. Aloha unleashed: A simple rec","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"Pt =F p(Condp(ϕp(At−1), F vis t )) (1) whereϕ(·)projects the past trajectory into feature space, and Cond(·)is a feature fusion operator which degrades to outputFvis t directly when there is no past estimate (t= 0). 3.3 Instantiations To demonstrate the versatility of X-Imitator, we instantiate the action branch using three representative visuomotor policies: DP3 [72], ACT [75], and RISE [59]. For the pose branch, we implement it as a lightweight diffusion head [12,27] for simplicity. While thes","claim_type":"method","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"Our project page is at https://3dgen4robot.github.io. Index Terms-3D generation, embodied AI, robotic simulation, scene generation, sim-to-real transfer ✦ 1 INTRODUCTION E MBODIEDAI and robotic systems are increasingly ex- pected to perceive, reason, and act in open-ended phys- ical environments [1], [2]. 
Recent progress in large-scale policy learning [3], [4], vision-language-action models [5]- [9], and high-fidelity simulation [10]-[12] has significantly expanded what these systems can do. How","claim_type":"background","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"shared robot learning harness: a substrate of typed contracts, chambered execution, and uniform transport, plus a content layer of Guides, Sensors, and State. This changes the unit of work from pairwise integration to reusable onboarding, reducing the burden toΘ(N+M+K). (W AM)), benchmark suitesB (e.g., LIBERO [9], RoboCasa [10], ManiSkill [11], ALOHA [12], etc.), and robot embodiments R (e.g., single-arm, bimanual, dexterous-hand, locomotion, humanoid, etc.), with cardinalities N, M, and K, res","claim_type":"dataset","confidence":0.9,"evidence_strength":"citation_context"},{"claim_text":"its \"Alternating Condition Injection\" scheme [27], but shows a limitation in handling closed-loop feedback. As visualized in Figure 6, it often fails to correct mistakes in the \"scoop X into *We use this FiLM-EfficientNet implementation only for language- dependent tasks (\"scoop X into bowl\" and \"put X into pot\"). For clothes folding tasks, we use the original ResNet-18 [10] backbone as in [57]. Fig. 5: ALOHA language following results. Success rates in approaching language-specified target obje","claim_type":"method","confidence":0.85,"evidence_strength":"citation_context"},{"claim_text":"In Conference on Robot Learning, pages 1723-1736. PMLR, 2023. [17] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional generative models. Advances in neural information processing systems , 28, 2015. [18] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013. [19] Tony Z Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. Learning fine-grained bimanual manipulation with low-cost hardware. 
","claim_type":"method","confidence":0.8,"evidence_strength":"citation_context"}],"why_cited":"Pith tracks Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware because it crossed a citation-hub threshold. Current citing contexts most often use it as background evidence (7 contexts).","role_counts":[{"n":7,"context_role":"background"},{"n":4,"context_role":"method"},{"n":1,"context_role":"dataset"}]},"error":null,"updated_at":"2026-05-15T20:47:59.912676+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","claims":[{"claim_text":"Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleop","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T06:37:40.504423+00:00"}},"summary":{"title":"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware","claims":[{"claim_text":"Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. 
Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleop","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"$\\pi_0$: A Vision-Language-Action Flow Model for General Robot Control","work_id":"f790abdc-a796-482f-a40d-f8ee035ecfc2","shared_citers":48},{"title":"OpenVLA: An Open-Source Vision-Language-Action Model","work_id":"3e7e65c5-5aed-4fe9-8414-2092bcb31cc7","shared_citers":44},{"title":"$\\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization","work_id":"d1ad7304-d09a-49bc-809e-846439f6aff9","shared_citers":34},{"title":"RT-1: Robotics Transformer for Real-World Control at Scale","work_id":"e11bda85-8531-46bc-a07f-d0ade3643ab1","shared_citers":29},{"title":"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control","work_id":"ff438a8a-8003-4fae-9131-acd418b3597b","shared_citers":23},{"title":"GR00T N1: An Open Foundation Model for Generalist Humanoid Robots","work_id":"e2db69c7-ee8a-4cb7-a761-7b8de1dfcf97","shared_citers":22},{"title":"Octo: An Open-Source Generalist Robot Policy","work_id":"f9ca0722-8855-48c3-a27a-0eefb7e19253","shared_citers":22},{"title":"Open X-Embodiment: Robotic Learning Datasets and RT-X Models","work_id":"62f0fb6c-e6ae-4dc4-95a4-d9dd64b240e8","shared_citers":18},{"title":"Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success","work_id":"04f46bb3-4346-47e8-bf09-c75d91f96e87","shared_citers":17},{"title":"RoboTwin 2.0: A Scalable Data Generator and Benchmark with 
Strong Domain Randomization for Robust Bimanual Robotic Manipulation","work_id":"9b985126-4a2f-4bdf-b014-2a7524ec634e","shared_citers":17},{"title":"DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset","work_id":"13253de2-3d89-415c-8c2f-3adb25d4c337","shared_citers":15},{"title":"FAST: Efficient Action Tokenization for Vision-Language-Action Models","work_id":"83a8f966-6cfa-4f21-81f3-87440aae238f","shared_citers":15},{"title":"SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics","work_id":"0c5e9314-5fa7-4613-ad12-605a71d561d2","shared_citers":15},{"title":"RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation","work_id":"12319725-bc7d-4c32-a229-ad270a7460bc","shared_citers":14},{"title":"Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations","work_id":"62dbe235-8473-4190-8686-17e7437de50f","shared_citers":13},{"title":"3D-VLA: A 3D Vision-Language-Action Generative World Model","work_id":"aebf924c-e761-437e-9cee-f1ccc2e427bd","shared_citers":11},{"title":"SAM 2: Segment Anything in Images and Videos","work_id":"acc13f66-d814-44f9-9688-375688bf2d4a","shared_citers":11},{"title":"What Matters in Learning from Offline Human Demonstrations for Robot Manipulation","work_id":"6a4c95c5-540e-4854-946d-c7c8a6c540ba","shared_citers":11},{"title":"arXiv preprint arXiv:2403.03954 (2024)","work_id":"bded01e1-c070-4537-a75a-ace4c75d0c95","shared_citers":10},{"title":"Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets","work_id":"59e728c0-b6ca-4759-a8f4-02b981f2220f","shared_citers":10},{"title":"Do As I Can, Not As I Say: Grounding Language in Robotic Affordances","work_id":"037320f1-b0a9-4cbe-a639-bfb25409ce71","shared_citers":10},{"title":"Flow Matching for Generative Modeling","work_id":"6edb71c4-5d64-40af-a394-9757ea051a36","shared_citers":10},{"title":"LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot 
Learning","work_id":"662203ad-084f-42c4-8e60-977b3173755b","shared_citers":10},{"title":"Mobile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation","work_id":"5f6ff8ef-ed80-4c00-92c2-361c80bf8448","shared_citers":10}],"time_series":[{"n":6,"year":2024},{"n":6,"year":2025},{"n":71,"year":2026}],"dependency_candidates":[]},"authors":[{"id":"379e406e-0cbc-4ede-b9dd-9a76a16a6da8","orcid":null,"display_name":"Chelsea Finn","source":"manual","import_confidence":0.72},{"id":"ee56e6c3-f424-4a4b-852d-8ab2deb6dc65","orcid":null,"display_name":"Sergey Levine","source":"manual","import_confidence":0.72},{"id":"374beb60-b37d-4aba-b168-380d605e39cd","orcid":null,"display_name":"Tony Z. Zhao","source":"manual","import_confidence":0.72},{"id":"217e0c5c-bc77-488e-a9e0-4f47d7c8ea60","orcid":null,"display_name":"Vikash Kumar","source":"manual","import_confidence":0.72}]}}