pith. machine review for the scientific record.

arxiv: 2604.09462 · v1 · submitted 2026-04-10 · 💻 cs.RO

Recognition: unknown

Adaptor: Advancing Assistive Teleoperation with Few-Shot Learning and Cross-Operator Generalization

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 17:01 UTC · model grok-4.3

classification 💻 cs.RO
keywords assistive teleoperation · few-shot learning · cross-operator generalization · intent recognition · shared control · trajectory perturbation · vision-language model fusion · keyframe extraction

The pith

Adaptor uses few-shot learning with trajectory perturbations and vision-language fusion to stabilize intent recognition across different teleoperators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Adaptor as a framework that addresses a recurring problem in shared-control teleoperation: diverse operator habits produce inconsistent control signals. It preprocesses input trajectories by injecting noise to model intent uncertainty and extracting geometry-aware keyframes, then encodes them with an Intention Expert and fuses the result with context from a pre-trained vision-language model to condition an Action Expert. This two-stage approach aims to close the domain gap between operators without requiring large amounts of per-user data. A sympathetic reader would care because stable cross-user performance could make assistive robots more practical for daily tasks where different people take turns controlling the same device.

Core claim

Adaptor bridges the domain gap caused by inter-operator variability in trajectory distributions through a preprocessing stage that synthesizes perturbations via noise injection and performs geometry-aware keyframe extraction, followed by a policy learning stage that encodes processed trajectories with an Intention Expert and fuses them with pre-trained vision-language model context to condition an Action Expert for action generation.

What carries the argument

The two-stage Adaptor pipeline: preprocessing via noise-injected trajectory perturbation and geometry-aware keyframe extraction, then policy learning that fuses an Intention Expert encoding with vision-language model context to condition the Action Expert.
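
As a concrete illustration of the preprocessing half of that pipeline, the sketch below shows one plausible reading of noise-injected perturbation and a simple curvature-based keyframe rule standing in for "geometry-aware" extraction. All function names, shapes, and thresholds are assumptions for illustration, not the paper's implementation.

# Hypothetical sketch of stage (i): perturbation synthesis and keyframe extraction.
import numpy as np

def perturb_trajectory(traj, sigma=0.01, n_samples=8):
    """Synthesize perturbed copies of an operator trajectory (T, dof) via Gaussian noise."""
    noise = np.random.normal(0.0, sigma, size=(n_samples,) + traj.shape)
    return traj[None] + noise                                  # (n_samples, T, dof)

def extract_keyframes(traj, angle_thresh=0.2):
    """Keep waypoints where the path bends sharply -- one simple geometry-aware criterion."""
    v = np.diff(traj, axis=0)                                  # segment directions
    cos = np.einsum("td,td->t", v[:-1], v[1:]) / (
        np.linalg.norm(v[:-1], axis=1) * np.linalg.norm(v[1:], axis=1) + 1e-8)
    bends = np.where(np.arccos(np.clip(cos, -1.0, 1.0)) > angle_thresh)[0] + 1
    return traj[np.concatenate(([0], bends, [len(traj) - 1]))]

traj = np.cumsum(np.random.randn(50, 7) * 0.02, axis=0)        # a fake 7-dof operator trajectory
perturbed = perturb_trajectory(traj)                           # intent-uncertainty samples
keyframes = extract_keyframes(perturbed[0])                    # sparse trajectory guidance
# Stage (ii) would encode `keyframes` with the Intention Expert and fuse the result
# with VLM context to condition the Action Expert.

The curvature rule is only one way to make "geometry-aware" concrete; the paper's Figure 3 describes its own extraction scheme.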

If this is right

  • The method improves success rates and efficiency compared to existing baselines on both real-world and simulated assistive teleoperation tasks.
  • Performance variance remains low when the same system is used by operators with different levels of expertise (one way such variance could be quantified is sketched after this list).
  • The framework demonstrates robust generalization to new operators without additional per-user retraining.
  • State-of-the-art results hold across the tested benchmarks for shared-control intent recognition.
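
One hedged way to make the low-variance bullet operational: compute each operator's success rate over a fixed number of trials and look at the spread across operators. The operators and values below are placeholders, not results from the paper.

# Placeholder illustration of cross-operator variance; all numbers are invented.
import statistics

success_rates = {            # per-operator success rate, e.g. over 30 trials each
    "novice_1": 0.80,
    "novice_2": 0.77,
    "expert_1": 0.83,
    "expert_2": 0.82,
}
mean = statistics.mean(success_rates.values())
sd = statistics.stdev(success_rates.values())
print(f"mean={mean:.2f}, sd={sd:.2f}")   # a small sd across operators is what "low variance" asserts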

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same preprocessing steps could be tested on other human-in-the-loop systems where user style varies, such as personalized interfaces or collaborative robots.
  • Combining synthetic perturbations with large pre-trained models may offer a general pattern for reducing data needs in human-robot interaction tasks.
  • If the keyframe extraction proves reliable, it might allow shorter training sessions when introducing a new user to the teleoperation setup.
  • The approach leaves open whether similar gains appear when the underlying robot platform or task type changes substantially.

Load-bearing premise

That noise injection on trajectories plus geometry-aware keyframe extraction, when fused with a pre-trained vision-language model, is sufficient to handle the full range of differences in how operators generate control signals.
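
A hedged way to probe this premise: check whether a held-out operator's trajectory stays inside the envelope that the injected noise spans around a reference demonstration. The per-waypoint k·sigma envelope rule and all names below are illustrative assumptions, not a procedure from the paper.

# Illustrative coverage check: does a new operator's motion fall inside the perturbation envelope?
import numpy as np

def envelope_coverage(reference, new_traj, sigma=0.01, k=3.0):
    """Fraction of waypoints of `new_traj` within k*sigma of `reference` (both of length T)."""
    deviation = np.linalg.norm(new_traj - reference, axis=1)   # per-waypoint distance
    return float(np.mean(deviation <= k * sigma))

rng = np.random.default_rng(1)
ref = np.cumsum(rng.normal(0, 0.02, size=(50, 7)), axis=0)     # reference demonstration
similar_op = ref + rng.normal(0, 0.005, size=ref.shape)        # habits close to the noise model
outlier_op = ref + rng.normal(0, 0.08, size=ref.shape)         # habits well outside it
print(envelope_coverage(ref, similar_op), envelope_coverage(ref, outlier_op))

Low coverage for a new operator would be an early warning that the premise is carrying more weight than the perturbation model can support, which is the failure mode described under "What would settle it".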

What would settle it

A decisive test: whether Adaptor's performance drops below baseline methods when evaluated on a new set of operators whose movement patterns fall outside the range covered by the injected noise perturbations and extracted keyframes.

Figures

Figures reproduced from arXiv: 2604.09462 by Fei Yan, Tianlv Huang, Wei Han, Weinan Hong, Xiangyu Chen, Xuan Song, Yihang Yin, Yuan Xu, Yue Cao, Yu Liu, Zipei Fan.

Figure 1: Evolution of teleoperation paradigms. Left: Direct teleoperation maps human inputs to robot commands but suffers from instability due to human-robot dynamic mismatches. Middle: Conventional assistance relies on expert demonstrations or fixed intent sets, often failing to generalize to diverse operator habits (inter-operator heterogeneity). Right: Adaptor (Ours) models intent uncertainty via trajectory pert… view at source ↗

Figure 2: Overview of the Adaptor framework. The architecture comprises two primary phases: (i) Preprocessing, where perturbation distributions and keyframes are extracted to model intent uncertainty; and (ii) Policy Learning. In this phase, the VLM backbone extracts environmental context, while the Intention Expert synthesizes semantic data with preprocessed trajectory guidance to infer latent intent, and the Actio… view at source ↗

Figure 3: Schematic of the intent keyframe extraction. view at source ↗

Figure 4: Overview of the experimental setup. Left: Representative tasks across three robotic platforms, including ALOHA simulator (Insertion, Cube Transfer), PIPER (Pen Uncapping, Shirt Folding), and Realman (Pen Organization, Cube Stacking). Right: Schematic of the teleoperation system architecture. view at source ↗

Figure 5: Quantitative Analysis of User Satisfaction. Mean satisfaction scores derived from questionnaires administered after participants completed 30 trials for each task–method combination. Data are averaged across all tasks, with error bars representing the standard deviation (SD). view at source ↗
Original abstract

Assistive teleoperation enhances efficiency via shared control, yet inter-operator variability, stemming from diverse habits and expertise, induces highly heterogeneous trajectory distributions that undermine intent recognition stability. We present Adaptor, a few-shot framework for robust cross-operator intent recognition. The Adaptor bridges the domain gap through two stages: (i) preprocessing, which models intent uncertainty by synthesizing trajectory perturbations via noise injection and performs geometry-aware keyframe extraction; and (ii) policy learning, which encodes the processed trajectories with an Intention Expert and fuses them with the pre-trained vision-language model context to condition an Action Expert for action generation. Experiments on real-world and simulated benchmarks demonstrate that Adaptor achieves state-of-the-art performance, improving success rates and efficiency over baselines. Moreover, the method exhibits low variance across operators with varying expertise, demonstrating robust cross-operator generalization.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated author's rebuttal, a circularity audit, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper introduces Adaptor, a few-shot framework for assistive teleoperation that mitigates inter-operator variability in trajectory distributions. It employs a two-stage pipeline: (i) preprocessing via noise injection to synthesize trajectory perturbations and geometry-aware keyframe extraction, and (ii) policy learning that encodes processed trajectories with an Intention Expert, fuses them with pre-trained vision-language model context, and conditions an Action Expert for action generation. Experiments on real-world and simulated benchmarks are reported to achieve state-of-the-art success rates and efficiency while exhibiting low variance across operators of varying expertise, supporting robust cross-operator generalization.

Significance. If the empirical results hold under scrutiny, this work could meaningfully advance shared-control teleoperation by providing a practical mechanism for handling user-specific trajectory heterogeneity without extensive per-user retraining. The combination of perturbation-based data augmentation, geometry-aware processing, and VLM-conditioned few-shot adaptation addresses a persistent barrier in assistive robotics and could improve reliability in domains such as remote manipulation or rehabilitation robotics. The emphasis on cross-operator low variance is a particularly useful contribution for real-world deployment.

minor comments (3)
  1. Abstract: The claim of state-of-the-art performance would be strengthened by including at least one concrete quantitative result (e.g., success-rate delta or efficiency metric) and naming the primary baselines, even in condensed form.
  2. Method section: The mechanism that fuses the Intention Expert output with the pre-trained VLM context to condition the Action Expert is described only at a high level; adding a concise equation or diagram illustrating the conditioning step would improve reproducibility (one illustrative form is sketched after this list).
  3. Experiments: While low cross-operator variance is highlighted, the manuscript should explicitly state the number of operators, their expertise distribution, and the statistical test used to support the variance claim.
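
One illustrative (and assumed) form of the conditioning step the second comment asks about, written only from the abstract's description: the Intention Expert encodes the preprocessed trajectory $\tilde{\tau}$ into a latent intent, the VLM supplies environmental context, and the Action Expert samples actions from the fused conditioning,

$$ z = f_{\mathrm{IE}}(\tilde{\tau}), \qquad c = f_{\mathrm{VLM}}(o_t, \ell), \qquad a_{t:t+H} \sim \pi_\theta\big(\cdot \mid \mathrm{fuse}(z, c)\big), $$

where $o_t$ is the current observation, $\ell$ an optional language instruction, and $\mathrm{fuse}(\cdot)$ stands in for whatever fusion (concatenation, cross-attention, or otherwise) the paper actually uses; the symbols are placeholders rather than the authors' notation.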

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation for minor revision. The referee's description accurately captures Adaptor's two-stage pipeline, use of trajectory perturbation and vision-language conditioning, and emphasis on low-variance cross-operator performance. No major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents an empirical few-shot framework for assistive teleoperation consisting of a preprocessing stage (noise injection for trajectory perturbations and geometry-aware keyframe extraction) followed by policy learning (Intention Expert encoding fused with pre-trained VLM context to condition an Action Expert). No equations, derivations, or first-principles predictions appear in the abstract or method sketch. Performance claims rest entirely on reported experimental comparisons of success rates, efficiency, and cross-operator variance against baselines on real-world and simulated benchmarks. No self-definitional steps, fitted inputs renamed as predictions, load-bearing self-citations, or ansatz smuggling are present; the central sufficiency claim is tested directly by the benchmarks rather than reducing to the method's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities can be extracted. The central claim implicitly assumes that operator trajectory distributions can be adequately modeled by noise injection and that a pre-trained VLM provides useful conditioning without domain-specific fine-tuning.

pith-pipeline@v0.9.0 · 5473 in / 1127 out tokens · 42879 ms · 2026-05-10T17:01:34.115453+00:00 · methodology


Reference graph

Works this paper leans on

46 extracted references · 14 canonical work pages · 7 internal anchors

  1. [1]

    $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

    K. Black, N. Brown, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, L. Groom, K. Hausman, B. Ichter et al., “π0: A vision-language-action flow model for general robot control,” arXiv preprint arXiv:2410.24164, 2024

  2. [2]

    RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

    S. Liu, L. Wu, B. Li, H. Tan, H. Chen, Z. Wang, K. Xu, H. Su, and J. Zhu, “Rdt-1b: a diffusion foundation model for bimanual manipulation,” arXiv preprint arXiv:2410.07864, 2024

  3. [3]

    OpenVLA: An Open-Source Vision-Language-Action Model

    M. J. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Foster, G. Lam, P. Sanketi et al., “OpenVLA: An open-source vision-language-action model,” arXiv preprint arXiv:2406.09246, 2024

  4. [4]

    Rt-2: Vision-language-action models transfer web knowledge to robotic control,

    B. Zitkovich, T. Yu, S. Xu, P. Xu, T. Xiao, F. Xia, J. Wu, P. Wohlhart, S. Welker, A. Wahid et al., “Rt-2: Vision-language-action models transfer web knowledge to robotic control,” in Conference on Robot Learning. PMLR, 2023, pp. 2165–2183

  5. [5]

    Open teach: A versatile teleoperation system for robotic manipulation,

    A. Iyer, Z. Peng, Y. Dai, I. Guzey, S. Haldar, S. Chintala, and L. Pinto, “Open teach: A versatile teleoperation system for robotic manipulation,” arXiv preprint arXiv:2403.07870, 2024

  6. [6]

    Mink: Python inverse kinematics based on MuJoCo,

    K. Zakka, “Mink: Python inverse kinematics based on MuJoCo,” Feb. 2026. [Online]. Available: https://github.com/kevinzakka/mink

  7. [7]

    A shared autonomy system for precise and efficient remote underwater manipulation,

    A. Phung, G. Billings, A. F. Daniele, M. R. Walter, and R. Camilli, “A shared autonomy system for precise and efficient remote underwater manipulation,” IEEE Transactions on Robotics, vol. 40, pp. 4147–4159, Jan. 2024

  8. [8]

    Human-agent joint learning for efficient robot manipulation skill acquisition,

    S. Luo, Q. Peng, J. Lv, K. Hong, K. R. Driggs-Campbell, C. Lu, and Y.-L. Li, “Human-agent joint learning for efficient robot manipulation skill acquisition,” arXiv preprint arXiv:2407.00299, 2024

  9. [9]

    To the noise and back: Diffusion for shared autonomy,

    T. Yoneda, L. Sun, G. Yang, B. Stadie, and M. Walter, “To the noise and back: Diffusion for shared autonomy,” arXiv preprint arXiv:2302.12244, 2023

  10. [10]

    Dragon: A dialogue-based robot for assistive navigation with visual language grounding,

    S. Liu, A. Hasan, K. Hong, R. Wang, P. Chang, Z. Mizrachi, J. Lin, D. L. McPherson, W. A. Rogers, and K. Driggs-Campbell, “Dragon: A dialogue-based robot for assistive navigation with visual language grounding,” IEEE Robotics and Automation Letters, vol. 9, no. 4, pp. 3712–3719, 2024

  11. [11]

    Independence in the home: A wearable interface for a person with quadriplegia to teleoperate a mobile manipulator,

    A. Padmanabha, J. Gupta, C. Chen, J. Yang, V. Nguyen, D. J. Weber, C. Majidi, and Z. Erickson, “Independence in the home: A wearable interface for a person with quadriplegia to teleoperate a mobile manipulator,” in Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 2024, pp. 542–551

  12. [12]

    Balanced information gathering and goal-oriented actions in shared autonomy,

    C. Brooks and D. Szafir, “Balanced information gathering and goal-oriented actions in shared autonomy,” in 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2019, pp. 85–94

  13. [13]

    Autonomy in physical human-robot interaction: A brief survey,

    M. Selvaggio, M. Cognetti, S. Nikolaidis, S. Ivaldi, and B. Siciliano, “Autonomy in physical human-robot interaction: A brief survey,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 7989–7996, 2021

  14. [14]

    Probabilistic human intent recognition for shared autonomy in assistive robotics,

    S. Jain and B. Argall, “Probabilistic human intent recognition for shared autonomy in assistive robotics,” ACM Transactions on Human-Robot Interaction, vol. 9, no. 1, 2020

  15. [15]

    Shared autonomy via hindsight optimization for teleoperation and teaming,

    S. Javdani, H. Admoni, S. Pellegrinelli, S. S. Srinivasa, and J. A. Bagnell, “Shared autonomy via hindsight optimization for teleoperation and teaming,” The International Journal of Robotics Research, vol. 37, no. 7, pp. 717–742, 2018

  16. [16]

    I know what you meant: Learning human objectives by (under) estimating their choice set,

    A. Jonnavittula and D. P. Losey, “I know what you meant: Learning human objectives by (under) estimating their choice set,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 2747–2753

  17. [17]

    HARMONIC: A multimodal dataset of assistive human–robot collaboration,

    B. A. Newman, R. M. Aronson, S. S. Srinivasa, K. Kitani, and H. Admoni, “HARMONIC: A multimodal dataset of assistive human–robot collaboration,” The International Journal of Robotics Research, vol. 41, no. 1, pp. 3–11, 2022

  18. [18]

    Asha: Assistive teleoperation via human-in-the-loop reinforcement learning,

    S. Chen, J. Gao, S. Reddy, G. Berseth, A. D. Dragan, and S. Levine, “Asha: Assistive teleoperation via human-in-the-loop reinforcement learning,” in 2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 7505–7512

  19. [19]

    Conformalized teleoperation: Confidently mapping human inputs to high-dimensional robot actions,

    M. Zhao, R. Simmons, H. Admoni, and A. Bajcsy, “Conformalized teleoperation: Confidently mapping human inputs to high-dimensional robot actions,” arXiv preprint arXiv:2406.07767, 2024

  20. [20]

    No, to the right: Online language corrections for robotic manipulation via shared autonomy,

    Y. Cui, S. Karamcheti, R. Palleti, N. Shivakumar, P. Liang, and D. Sadigh, “No, to the right: Online language corrections for robotic manipulation via shared autonomy,” in Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, 2023, pp. 93–101

  21. [21]

    Learning to share autonomy across repeated interaction,

    A. Jonnavittula and D. P. Losey, “Learning to share autonomy across repeated interaction,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, pp. 1851–1858

  22. [22]

    Situational confidence assistance for lifelong shared autonomy,

    M. Zurek, A. Bobu, D. S. Brown, and A. D. Dragan, “Situational confidence assistance for lifelong shared autonomy,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 2783–2789

  23. [23]

    Sari: Shared autonomy across repeated interaction,

    A. Jonnavittula, S. A. Mehta, and D. P. Losey, “Sari: Shared autonomy across repeated interaction,” ACM Transactions on Human-Robot Interaction, vol. 13, no. 2, pp. 1–36, 2024

  24. [24]

    Rt-trajectory: Robotic task generalization via hindsight trajectory sketches,

    J. Gu, S. Kirmani, P. Wohlhart, Y. Lu, M. G. Arenas, K. Rao, W. Yu, C. Fu, K. Gopalakrishnan, Z. Xu et al., “Rt-trajectory: Robotic task generalization via hindsight trajectory sketches,” arXiv preprint arXiv:2311.01977, 2023

  25. [25]

    Rt-sketch: Goal-conditioned imitation learning from hand-drawn sketches,

    P. Sundaresan, Q. Vuong, J. Gu, P. Xu, T. Xiao, S. Kirmani, T. Yu, M. Stark, A. Jain, K. Hausman et al., “Rt-sketch: Goal-conditioned imitation learning from hand-drawn sketches,” in 8th Annual Conference on Robot Learning, 2024

  26. [26]

    Inferring human intent and predicting human action in human–robot collaboration,

    G. Hoffman, T. Bhattacharjee, and S. Nikolaidis, “Inferring human intent and predicting human action in human–robot collaboration,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 7, no. 1, pp. 73–95, 2024

  27. [27]

    Casper: Inferring diverse intents for assistive teleoperation with vision language models,

    H. Liu, R. Shah, S. Liu, J. Pittenger, M. Seo, Y. Cui, Y. Bisk, R. Martín-Martín, and Y. Zhu, “Casper: Inferring diverse intents for assistive teleoperation with vision language models,” arXiv preprint arXiv:2506.14727, 2025

  28. [28]

    Gemma 3 Technical Report

    G. Team, A. Kamath, J. Ferret, S. Pathak, N. Vieillard, R. Merhej, S. Perrin, T. Matejovicova, A. Ramé, M. Rivière et al., “Gemma 3 technical report,” 2025. [Online]. Available: https://arxiv.org/abs/2503.19786

  29. [29]

    Stable-bc: Controlling covariate shift with stable behavior cloning,

    S. A. Mehta, Y. U. Ciftci, B. Ramachandran, S. Bansal, and D. P. Losey, “Stable-bc: Controlling covariate shift with stable behavior cloning,” IEEE Robotics and Automation Letters, 2025

  30. [30]

    Unexplored faces of robustness and out-of-distribution: Covariate shifts in environment and sensor domains,

    E. Baek, K. Park, J. Kim, and H.-S. Kim, “Unexplored faces of robustness and out-of-distribution: Covariate shifts in environment and sensor domains,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22294–22303

  31. [31]

    Robust adaptive control of high-order fully-actuated systems: Command filtered backstepping with concurrent learning,

    W. Liu, G. Duan, M. Hou, and H. Kong, “Robust adaptive control of high-order fully-actuated systems: Command filtered backstepping with concurrent learning,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 71, no. 12, pp. 5780–5791, 2024

  32. [32]

    Persistence of excitation in linear systems,

    M. Green and J. B. Moore, “Persistence of excitation in linear systems,” Systems & Control Letters, vol. 7, no. 5, pp. 351–360, 1986

  33. [33]

    Efficient reductions for imitation learning,

    S. Ross and D. Bagnell, “Efficient reductions for imitation learning,” in Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2010, pp. 661–668

  34. [34]

    A reduction of imitation learning and structured prediction to no-regret online learning,

    S. Ross, G. Gordon, and D. Bagnell, “A reduction of imitation learning and structured prediction to no-regret online learning,” in Proceedings of the fourteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2011, pp. 627–635

  35. [35]

    Inference-time policy steering through human interactions,

    Y. Wang, L. Wang, Y. Du, B. Sundaralingam, X. Yang, Y.-W. Chao, C. Pérez-D’Arpino, D. Fox, and J. Shah, “Inference-time policy steering through human interactions,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 15626–15633

  36. [36]

    Hg-dagger: Interactive imitation learning with human experts,

    M. Kelly, C. Sidrane, K. Driggs-Campbell, and M. J. Kochenderfer, “Hg-dagger: Interactive imitation learning with human experts,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8077–8083

  37. [37]

    Dart: Noise injection for robust imitation learning,

    M. Laskey, J. Lee, R. Fox, A. Dragan, and K. Goldberg, “Dart: Noise injection for robust imitation learning,” in Conference on Robot Learning. PMLR, 2017, pp. 143–156

  38. [38]

    Learning transferable visual models from natural language supervision,

    A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in International conference on machine learning. PMLR, 2021, pp. 8748–8763

  39. [39]

    Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,

    J. Li, D. Li, S. Savarese, and S. Hoi, “Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,” in International conference on machine learning. PMLR, 2023, pp. 19730–19742

  40. [40]

    LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

    P. Gao, J. Han, R. Zhang, Z. Lin, S. Geng, A. Zhou, W. Zhang, P. Lu, C. He, X. Yue et al., “Llama-adapter v2: Parameter-efficient visual instruction model,” arXiv preprint arXiv:2304.15010, 2023

  41. [41]

    HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

    J. Liu, H. Chen, P. An, Z. Liu, R. Zhang, C. Gu, X. Li, Z. Guo, S. Chen, M. Liu et al., “Hybridvla: Collaborative diffusion and autoregression in a unified vision-language-action model,” arXiv preprint arXiv:2503.10631, 2025

  42. [42]

    $\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

    Physical Intelligence, K. Black, N. Brown, J. Darpinian, K. Dhabalia, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai et al., “π0.5: A vision-language-action model with open-world generalization,” arXiv preprint arXiv:2504.16054, 2025

  43. [43]

    Teleoperation of humanoid robots: A survey,

    K. Darvish, L. Penco, J. Ramos, R. Cisneros, J. Pratt, E. Yoshida, S. Ivaldi, and D. Pucci, “Teleoperation of humanoid robots: A survey,” IEEE Transactions on Robotics, vol. 39, no. 3, pp. 1706–1727, 2023

  44. [44]

    Lora: Low-rank adaptation of large language models

    E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen et al., “Lora: Low-rank adaptation of large language models,” ICLR, vol. 1, no. 2, p. 3, 2022

  45. [45]

    Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware

    T. Zhao, V. Kumar, S. Levine, and C. Finn, “Learning fine-grained bimanual manipulation with low-cost hardware,” RSS, vol. abs/2304.13705, 2023

  46. [46]

    The sense of agency in assistive robotics using shared autonomy,

    M. A. Collier, R. Narayan, and H. Admoni, “The sense of agency in assistive robotics using shared autonomy,” in 2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 2025, pp. 880–888