FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning

Deepak Pathak; Jason Jingzhou Liu; Kenneth Shaw; Philip Han; Ruslan Salakhutdinov; Satoshi Funabashi; Steven Oh; Tony Tao

arxiv: 2606.12406 · v1 · pith:XMNPRTDKnew · submitted 2026-06-10 · 💻 cs.RO · cs.AI · cs.LG · cs.SY · eess.SY

Steven Oh, Jason Jingzhou Liu, Tony Tao, Philip Han, Kenneth Shaw, Satoshi Funabashi, Ruslan Salakhutdinov, Deepak Pathak

Pith reviewed 2026-06-27 09:36 UTC · model grok-4.3

classification 💻 cs.RO · cs.AI · cs.LG · cs.SY · eess.SY

keywords external torque estimation · force sensing · commodity robot arms · policy learning · behavior cloning · contact-rich manipulation · data-driven sensing · force feedback

The pith

A neural network trained on 10 minutes of free-motion data estimates external torques on commodity robot arms as accurately as dedicated sensors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that external joint torque estimation can be learned without any contact-specific data or extra hardware. A model trained solely on free motion produces torque estimates that match dedicated sensors and supports force-feedback teleoperation. When this estimation is used to re-sample behavior cloning data around contact moments, policies achieve higher task progress on long-horizon manipulation tasks. The approach therefore removes the hardware barrier that has kept force-aware control limited to expensive arms.

Core claim

NEXT is a neural network trained on free-motion data to predict the joint torque expected without contact; subtracting that prediction from the measured joint torque yields the external joint torque. It reaches sensor-comparable accuracy after one minute of training on ten minutes of data. FIRST then uses these estimates to up-sample pre-contact and contact segments inside behavior cloning, yielding over 17 percent higher task progress than prior force-aware baselines across five long-horizon tasks. The combined system therefore supplies force awareness to off-the-shelf arms without added sensing hardware.
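The residual computation described in the Figure 2 caption can be sketched as follows: measured joint torque is motor current times a torque constant K, a model trained on free-motion data predicts the torque expected without contact, and the difference is taken as the external torque. This is a minimal illustration only. The function names, the torque constants, and the stand-in `free_space_model` (a toy gravity-plus-friction term in place of the paper's LSTM) are assumptions, not the authors' code.

```python
import numpy as np

def measured_torque(current, torque_constant):
    """Measured joint torque: per-joint motor current times torque constant K."""
    return np.asarray(current, dtype=float) * np.asarray(torque_constant, dtype=float)

def free_space_model(joint_pos, joint_vel):
    """Hypothetical stand-in for the learned free-space torque model.

    The paper uses an LSTM trained on free-motion data; this toy version is a
    gravity-like sine term plus viscous friction, for illustration only.
    """
    return 2.0 * np.sin(joint_pos) + 0.1 * joint_vel

def external_torque(current, torque_constant, joint_pos, joint_vel):
    """External torque = measured torque minus predicted free-space torque."""
    tau_meas = measured_torque(current, torque_constant)
    tau_free = free_space_model(np.asarray(joint_pos), np.asarray(joint_vel))
    return tau_meas - tau_free

# Two joints with assumed torque constants and a static pose.
K = np.array([0.5, 0.5])
q = np.array([0.0, np.pi / 2])
dq = np.array([0.0, 0.0])

# Currents that exactly match the free-space model: no external torque.
i_free = free_space_model(q, dq) / K
tau_ext_free = external_torque(i_free, K, q, dq)

# One extra ampere on the second joint shows up as external torque there.
i_contact = i_free + np.array([0.0, 1.0])
tau_ext_contact = external_torque(i_contact, K, q, dq)
```

In this toy setup the residual is zero in free motion and equals the extra current times K under load, which is the behavior the estimator is meant to reproduce on real hardware.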

What carries the argument

Neural External Torque Estimation (NEXT) paired with Force-Informed Re-Sampling Training (FIRST), where NEXT supplies torque signals from free-motion data and FIRST re-weights imitation learning to emphasize contact phases.
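One way to picture how a torque signal could mark contact phases (the Figure 10 caption mentions free motion, pre-contact, and contact) is a simple threshold with a look-back window. The sketch below is an assumption-laden illustration: the threshold value, the window length, and the function name `segment_phases` are hypothetical and are not taken from the paper.

```python
import numpy as np

FREE, PRE_CONTACT, CONTACT = 0, 1, 2

def segment_phases(tau_ext_norm, contact_threshold=0.5, pre_contact_steps=3):
    """Label each timestep as free motion, pre-contact, or contact.

    tau_ext_norm: 1-D array of normalized external torque magnitude per step.
    contact_threshold and pre_contact_steps are assumed hyperparameters.
    """
    tau = np.asarray(tau_ext_norm, dtype=float)
    labels = np.full(tau.shape[0], FREE, dtype=int)

    # Steps whose normalized torque exceeds the threshold count as contact.
    in_contact = tau > contact_threshold
    labels[in_contact] = CONTACT

    # Contact onsets are steps where contact begins after a non-contact step.
    previous = np.concatenate(([False], in_contact[:-1]))
    onsets = np.flatnonzero(in_contact & ~previous)

    # Mark a short window before each onset as pre-contact.
    for t in onsets:
        start = max(0, t - pre_contact_steps)
        window = labels[start:t]          # view into labels
        window[window == FREE] = PRE_CONTACT
    return labels

tau_demo = np.array([0.0, 0.1, 0.0, 0.1, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0])
labels = segment_phases(tau_demo, contact_threshold=0.5, pre_contact_steps=2)
```

Here the two steps before the torque spike are labeled pre-contact and the three steps above the threshold are labeled contact; everything else stays free motion.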

If this is right

Low-cost arms gain force-feedback teleoperation without added sensors.
Behavior cloning policies improve when contact segments are up-sampled using the learned torque signal.
The same hardware can now handle five different long-horizon tasks with higher completion rates.
Force-aware manipulation becomes available on commodity platforms rather than only on research-grade arms.
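The up-sampling of contact segments mentioned above can be sketched as weighted sampling over training timesteps. This is a minimal sketch, assuming per-step phase labels are already available; the up-sampling factor of 3 and the helper name `resampling_probabilities` are illustrative assumptions, not values or names from the paper.

```python
import numpy as np

def resampling_probabilities(phase_labels, upsample_factor=3.0,
                             upsampled=("pre_contact", "contact")):
    """Turn per-step phase labels into sampling probabilities.

    Steps in the up-sampled phases get weight `upsample_factor`; all other
    steps get weight 1. Weights are normalized to sum to one.
    """
    weights = np.array(
        [upsample_factor if phase in upsampled else 1.0 for phase in phase_labels]
    )
    return weights / weights.sum()

# Toy demonstration: 6 free-motion steps, 2 pre-contact steps, 2 contact steps.
phases = ["free"] * 6 + ["pre_contact"] * 2 + ["contact"] * 2
probs = resampling_probabilities(phases, upsample_factor=3.0)

# Draw a training batch of indices according to the re-weighted distribution.
rng = np.random.default_rng(0)
batch_indices = rng.choice(len(phases), size=8, replace=True, p=probs)

# Probability mass on pre-contact and contact steps after re-weighting.
contact_mass = probs[6:].sum()
```

Under uniform sampling the pre-contact and contact steps would carry 4/10 of the probability mass; with the assumed factor of 3 they carry 12/18, so a behavior cloning batch sees them more often.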

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same free-motion training approach might extend to learning other contact signals such as slip or stiffness without dedicated hardware.
If the method works across arm models, it could allow rapid transfer of force-sensitive skills between different low-cost platforms.
Real-world deployment would still require checking whether torque estimates remain reliable when payloads or environments differ from the free-motion collection set.

Load-bearing premise

A network trained only on free-motion trajectories will produce accurate external torque estimates once the arm makes contact, without meaningful domain shift.

What would settle it

Measure external torques with dedicated sensors during contact-rich tasks and compare them directly to NEXT predictions; large consistent errors would show the free-motion training does not generalize.
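A comparison of this kind could be computed per phase, so that errors during contact are not averaged away by the easier free-motion segments. The sketch below uses synthetic placeholder signals, not data from the paper, and the function names are hypothetical.

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two equal-length signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def pearson(a, b):
    """Pearson correlation coefficient between two signals."""
    return float(np.corrcoef(a, b)[0, 1])

def per_phase_errors(tau_sensor, tau_est, contact_mask):
    """Report errors separately on free-motion and contact timesteps."""
    tau_sensor = np.asarray(tau_sensor, dtype=float)
    tau_est = np.asarray(tau_est, dtype=float)
    mask = np.asarray(contact_mask, dtype=bool)
    return {
        "free_rmse": rmse(tau_sensor[~mask], tau_est[~mask]),
        "contact_rmse": rmse(tau_sensor[mask], tau_est[mask]),
        "contact_pearson": pearson(tau_sensor[mask], tau_est[mask]),
    }

# Synthetic placeholder signals (assumed values, for illustration only).
sensor = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 2.0])
estimate = np.array([0.1, -0.1, 0.1, -0.1, 1.2, 2.2, 3.2, 2.2])
contact = sensor > 0.5  # contact labeled from the sensor with an assumed threshold

report = per_phase_errors(sensor, estimate, contact)
```

Correlation is reported only for the contact segment here because the sensor signal is constant in free motion, where a correlation coefficient is undefined.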

Figures

Figures reproduced from arXiv: 2606.12406 by Deepak Pathak, Jason Jingzhou Liu, Kenneth Shaw, Philip Han, Ruslan Salakhutdinov, Satoshi Funabashi, Steven Oh, Tony Tao.

**Figure 1.** Overview of our approach. (a) Neural External Torque estimation (NEXT) produces high quality joint torque estimates using only 10 minutes of data without dedicated force sensors or explicit system-identification, enabling force-feedback teleoperation on low-cost arms, such as the Piper, YAM, and Nero. (b) Force-Informed Re-Sampling Training (FIRST) uses learned external torque estimates to segment demonstr…

**Figure 2.** External Force Estimation Deployment. At deployment time, we first obtain the measured joint torque by multiplying each joint's measured current by its torque constant K. We then use an LSTM trained on free-space data to estimate free-space torque, which is then subtracted from the measured joint torque to obtain external joint torque. We instantiate fθ as an LSTM-based sequence model, which maps the p…

**Figure 3.** FIRST is evaluated on five long-horizon, contact-rich tasks. Each task comprises multiple…

**Figure 4.** User Study. Our external force estimate (using NEXT) improves teleoperation over baselines and remains comparable to FACTR Teleop, which relies on dedicated external force sensors. Lower joint torque applied corresponds to less unnecessary exertion.

… with 5 feedback conditions: no feedback, disturbance-observer based feedback (DO), leader-follower position-position feedback [44] (PP), FACTR Teleop using Fr…

**Figure 5.** Left: In free space, the external joint torque should remain zero. Our method produces an estimate that is less noisy than the external sensor and remains near zero, while FILIC and the Disturbance Observer drift away. Right: During contact, our estimate closely tracks the external sensor, whereas FILIC and the Disturbance Observer deviate substantially.

7 Results

In our external force estimation experi…

**Figure 6.** Task progress across five contact-rich manipulation tasks using flow matching policy.

**Figure 7.** Default sampling yields higher validation loss on pre-contact and contact phases. By upsampling these phases during training, FIRST reduces their validation losses.

**Figure 8.** Distribution of the 10-minute free-motion dataset used to train NEXT on the Piper arm.

**Figure 9.** User study results comparing external torque feedback methods on the Piper. To evaluate NEXT in contact settings, we perform a force-feedback teleoperation user study. As shown in…

**Figure 10.** Contact phase segmentation from learned external torque. The normalized τext from NEXT is used to segment demonstrations into free motion, pre-contact, and contact phases. The image overlays show that the predicted labels align with the robot's interaction state during the task.

[Plot residue: Task Progress (0% to 100%) for LEGO Assembly, NIST Belt, NIST Insertion, Tool Clean Up, and Cap Screwing; legend begins with Base Policy…]

**Figure 11.** Task progress across five contact-rich manipulation tasks using ACT policy.

**Figure 12.** Moderate up-sampling generally improves performance, with best results around…

**Figure 13.** Attention values during screw-cap manipulation. Learned external torque produces…

Original abstract

Contact-rich manipulation requires force sensitivity, but many robot arms lack dedicated force sensors due to their high cost. We present Neural External Torque Estimation (NEXT), a data-driven method that estimates external joint torques without needing any dedicated force sensors. NEXT trains in 1 minute from only 10 minutes of free-motion data, yet achieves estimates comparable to dedicated joint-torque sensors. NEXT enables force-feedback teleoperation on low-cost arms and improves policy learning through Force-Informed Re-Sampling Training (FIRST), which up-samples pre-contact and contact segments during behavior cloning. Across five long-horizon tasks, FIRST outperforms prior force-aware policies by over 17% in task progress. Together, NEXT and FIRST bring force-aware teleoperation and policy learning to off-the-shelf robots without additional sensing hardware. Video results and code are available at https://jasonjzliu.com/factr2

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper's main pitch is a one-minute torque estimator from free-motion data plus contact resampling that claims a 17% policy boost, but the abstract leaves the contact generalization untested.

The one thing to know is that NEXT trains an external torque model in a minute on ten minutes of free-motion data and claims parity with real joint-torque sensors, while FIRST uses those estimates to up-sample contact segments in behavior cloning and reports a 17% lift across five long-horizon tasks.

The concrete contribution is the short training recipe and the resampling step that avoids extra hardware. Commodity arms become usable for force-sensitive teleoperation and imitation without new sensors, and the authors release code plus video, which lowers the barrier for others to try it.

The soft spot is exactly the one the stress-test flagged: training happens only on free motion, yet the estimates are used during contact-rich phases where friction, compliance, and impacts appear. The abstract supplies no per-phase error numbers, no ablation on contact versus non-contact segments, and no statistical detail on the 17% figure or the baselines. Without those, it is difficult to judge whether the domain shift is small or fatal to the central claim.

This is for robotics groups working on contact-rich imitation on low-cost arms who already run behavior cloning and want a cheap force signal. A reader who needs a working implementation could extract value from the methods if the full experiments hold up.

I would send it to peer review because the idea is practical and the claims are falsifiable, but the current evidence is too thin to evaluate the key generalization step.

Referee Report

2 major / 1 minor

Summary. The paper presents Neural External Torque Estimation (NEXT), a data-driven neural network method that estimates external joint torques on commodity robot arms after training for 1 minute on only 10 minutes of free-motion data, claiming performance comparable to dedicated joint-torque sensors. It further introduces Force-Informed Re-Sampling Training (FIRST), which uses NEXT estimates to up-sample pre-contact and contact segments during behavior cloning. Across five long-horizon tasks, FIRST is reported to outperform prior force-aware policies by over 17% in task progress. The combined approach is positioned to enable force-feedback teleoperation and improved policy learning on off-the-shelf robots without additional hardware.

Significance. If the generalization from free-motion training to contact-rich scenarios holds and the reported gains are robust, the work would be significant for the robotics community by removing the cost barrier to force sensing on commodity arms. The minimal data and training time requirements represent a practical strength that could accelerate adoption in manipulation research and applications.

major comments (2)

[Results/Evaluation section] The claim that NEXT achieves estimates comparable to dedicated joint-torque sensors and enables the 17% policy improvement via FIRST is load-bearing on generalization to contact, yet the manuscript provides no quantitative error breakdown (e.g., RMSE or correlation) separating contact-rich segments from free-motion segments. This leaves the domain-shift concern unaddressed despite training occurring exclusively on free-motion data.
[Policy learning experiments] The assertion that FIRST outperforms prior force-aware policies by over 17% in task progress across five tasks lacks reported baseline details, statistical tests, ablation on the role of NEXT estimates, or variance across runs, making it impossible to assess whether the gain is attributable to the force estimates or other factors.

minor comments (1)

The abstract states video and code availability at a URL, but the manuscript body should include a dedicated reproducibility section with hyperparameters, network architecture, and data collection protocol.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback. The comments highlight important areas for strengthening the evaluation of generalization and the policy learning results. We address each major comment below and will incorporate revisions to provide the requested quantitative details and statistical analyses.

Referee: [Results/Evaluation section] The claim that NEXT achieves estimates comparable to dedicated joint-torque sensors and enables the 17% policy improvement via FIRST is load-bearing on generalization to contact, yet the manuscript provides no quantitative error breakdown (e.g., RMSE or correlation) separating contact-rich segments from free-motion segments. This leaves the domain-shift concern unaddressed despite training occurring exclusively on free-motion data.

Authors: We agree that an explicit quantitative error breakdown on contact-rich versus free-motion segments would more directly address potential domain shift. While the overall comparable performance to dedicated sensors and the downstream policy improvements on contact-rich tasks provide supporting evidence, we will add a dedicated analysis in the revised Results/Evaluation section. This will include RMSE, MAE, and Pearson correlation computed separately on free-motion segments and on segments containing contacts (identified via ground-truth force thresholds from the evaluation setup). We will also report these metrics on held-out contact data collected after training. This revision will make the generalization claim more rigorous. revision: yes
Referee: [Policy learning experiments] The assertion that FIRST outperforms prior force-aware policies by over 17% in task progress across five tasks lacks reported baseline details, statistical tests, ablation on the role of NEXT estimates, or variance across runs, making it impossible to assess whether the gain is attributable to the force estimates or other factors.

Authors: We acknowledge these omissions limit interpretability. In the revised manuscript we will expand the Policy learning experiments section to include: (1) explicit descriptions and citations for all baseline force-aware policies, (2) statistical significance testing (paired t-tests with p-values and effect sizes) on task progress across the five tasks, (3) a full ablation isolating the contribution of NEXT-derived force estimates within FIRST (comparing against versions using no force upsampling or alternative heuristics), and (4) mean and standard deviation of task progress over at least five independent training runs with different random seeds to report variance. These additions will clarify the source of the reported gains. revision: yes
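The paired test mentioned in this response can be sketched with NumPy alone. The numbers below are placeholder task-progress values for five tasks, invented purely for illustration and not results from the paper; the function name is hypothetical, and the p-value lookup against a t-distribution is left out.

```python
import math
import numpy as np

def paired_t_statistic(a, b):
    """Paired t statistic, paired Cohen's d, and degrees of freedom.

    a, b: per-task scores for two methods, in the same task order.
    """
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = diff.size
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)            # sample standard deviation
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    cohens_d = mean_diff / sd_diff        # effect size for paired samples
    return float(t_stat), float(cohens_d), n - 1

# Placeholder values only (NOT from the paper): task progress on five tasks.
method = [0.80, 0.70, 0.90, 0.60, 0.75]
baseline = [0.70, 0.65, 0.80, 0.50, 0.60]

t_stat, effect, dof = paired_t_statistic(method, baseline)

# Mean and standard deviation across tasks, as one would report across seeds.
mean_progress = float(np.mean(method))
std_progress = float(np.std(method, ddof=1))
```

With only five paired observations the test has four degrees of freedom, so the reported variance across seeds matters as much as the headline difference.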

Circularity Check

0 steps flagged

No significant circularity in empirical claims

The paper describes a purely empirical, data-driven pipeline: NEXT is a neural network trained on 10 minutes of free-motion data to regress external torques, and FIRST is a re-sampling heuristic that uses those estimates to up-weight contact segments in behavior cloning. No derivation chain, first-principles equations, or uniqueness theorems are invoked; performance claims rest on reported training duration, hardware-sensor comparisons, and task-progress metrics across five tasks. These quantities are externally falsifiable through replication on physical robots and do not reduce to fitted parameters or self-citations by construction. The provided text contains no self-citation load-bearing steps or ansatz smuggling.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the generalization ability of neural networks from free-motion to contact regimes and on the effectiveness of contact-phase upsampling; no explicit numerical free parameters or new physical entities are introduced.

axioms (1)

Domain assumption: Neural networks trained on free-motion proprioceptive data can generalize to estimate external torques during physical contact.
This generalization is required for NEXT to replace dedicated sensors in contact-rich settings.

pith-pipeline@v0.9.1-grok · 5718 in / 1266 out tokens · 28768 ms · 2026-06-27T09:36:00.356967+00:00 · methodology

Reference graph

Works this paper leans on

56 extracted references · 10 canonical work pages

[1] ATI Industrial Automation. Gamma Force/Torque sensor. https://www.ati-ia.com/products/ft/ft_models.aspx?id=gamma, 2026. Accessed: April 28, 2026
[2] Bota Systems. 6-Axis Force Torque Sensors for Robotics. https://www.botasys.com/force-torque-sensors, 2024. Accessed: April 28, 2026
[3] H. Choi, J. E. Low, T. M. Huh, S. Hong, G. A. Uribe, K. A. Hoffmann, J. Di, T. G. Chen, A. A. Stanley, and M. R. Cutkosky. Coinft: A coin-sized, capacitive 6-axis force torque sensor for robotic applications. arXiv preprint arXiv:2503.19225, 2025
[4] R. Bhirangi, T. Hellebrekers, C. Majidi, and A. Gupta. Reskin: versatile, replaceable, lasting tactile skins. In CoRL, 2021
[5] M. Y. Cao, S. Laws, and F. R. y Baena. Six-axis force/torque sensors for robotics applications: A review. IEEE Sensors Journal, 21(24):27238–27251, 2021
[6] J. J. Liu, Y. Li, K. Shaw, T. Tao, R. Salakhutdinov, and D. Pathak. FACTR: Force-Attending Curriculum Training for Contact-Rich Policy Learning. In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2025. doi:10.15607/RSS.2025.XXI.079
[7] S. Oh, T. Takahashi, C. C. Beltran-Hernandez, Y. Kuroda, and M. Hamaya. A soft wrist with anisotropic and selectable stiffness for robust robot learning in contact-rich manipulation,
[8] URL https://arxiv.org/abs/2602.14434
[9] T. Buamanee, M. Kobayashi, Y. Uranishi, and H. Takemura. Bi-act: Bilateral control-based imitation learning via action chunking with transformer. In 2024 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), pages 410–415, 2024. doi:10.1109/AIM55361.2024.10637173
[10] K. Yamane, Y. Li, M. Konosu, K. Inami, J. Oaki, T. Tsuji, and S. Sakaino. Design and experimental validation of sensorless 4-channel bilateral teleoperation for low-cost manipulators,
[11] URL https://arxiv.org/abs/2507.06174
[12] H. Shi, S. Hu, Y. Hou, W. Wang, K. Liu, and S. Song. Minimalist compliance control, 2026. URL https://arxiv.org/abs/2603.00913
[13] K. Inami, K. Yamane, and S. Sakaino. Loss function considering dead zone for neural networks, 2024. URL https://arxiv.org/abs/2402.00393
[14] A. Zhu, Y. Tanaka, F. Rafeedi, and D. Hong. Cycloidal quasi-direct drive actuator designs with learning-based torque estimation for legged robotics. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 1–8. IEEE, 2025
[15] K. M. Park, J. Kim, J. Park, and F. C. Park. Learning-based real-time detection of robot collisions without joint torque sensors. IEEE Robotics and Automation Letters, 6(1):103–110,
[16] doi:10.1109/LRA.2020.3033269
[17] J. Liang and O. Kroemer. Contact localization for robot arms in motion without torque sensing. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 6322–
[18] K. M. Park, Y. Park, S. Yoon, and F. C. Park. Collision detection for robot manipulators using unsupervised anomaly detection algorithms. IEEE/ASME Transactions on Mechatronics, 27(5):2841–2851, 2022. doi:10.1109/TMECH.2021.3119057
[19] N. Yilmaz, J. Y. Wu, P. Kazanzides, and U. Tumerdem. Neural network based inverse dynamics identification and external force estimation on the da vinci research kit. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 1387–1393. IEEE, 2020
[20] Z. Zhang, H. Xu, Z. Yang, C. Yue, Z. Lin, H.-a. Gao, Z. Wang, and H. Zhao. Elucidating the design space of torque-aware vision-language-action models. In J. Lim, S. Song, and H.-W. Park, editors, Proceedings of The 9th Conference on Robot Learning, volume 305 of Proceedings of Machine Learning Research, pages 4019–4037. PMLR, 27–30 Sep 2025. URL https://pr...
[21] T. Kamijo, C. C. Beltran-Hernandez, and M. Hamaya. Learning variable compliance control from a few demonstrations for bimanual robot with haptic feedback teleoperation system. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
[22] J. Yu, H. Liu, Q. Yu, J. Ren, C. Hao, H. Ding, G. Huang, G. Huang, Y. Song, P. Cai, W. Zhang, and C. Lu. ForceVLA: Enhancing VLA models with a force-aware moe for contact-rich manipulation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems,
[23] URL https://openreview.net/forum?id=2845H8Ua5D
[24] Z. He, H. Fang, J. Chen, H.-S. Fang, and C. Lu. Foar: Force-aware reactive policy for contact-rich robotic manipulation. IEEE Robotics and Automation Letters, 10(6):5625–5632, 2025. doi:10.1109/LRA.2025.3560871
[25] W. Liu, J. Wang, Y. Wang, W. Wang, and C. Lu. Forcemimic: Force-centric imitation learning with force-motion capture system for contact-rich manipulation. In 2025 IEEE International Conference on Robotics and Automation (ICRA), 2025
[26] H. Ge, Y. Jia, Z. Li, Y. Li, Z. Chen, R. Huang, and G. Zhou. Filic: Dual-loop force-guided imitation learning with impedance torque control for contact-rich manipulation tasks, 2025. URL https://arxiv.org/abs/2509.17053
[27] H. Xue, J. Ren, W. Chen, G. Zhang, Y. Fang, G. Gu, H. Xu, and C. Lu. Reactive diffusion policy: Slow-fast visual-tactile policy learning for contact-rich manipulation. In Proceedings of Robotics: Science and Systems (RSS), 2025
[28] X. B. Peng, A. Kumar, G. Zhang, and S. Levine. Advantage-weighted regression: Simple and scalable off-policy reinforcement learning. arXiv preprint arXiv:1910.00177, 2019
[29] Z. Wang, A. Novikov, K. Zolna, J. T. Springenberg, S. Reed, B. Shahriari, N. Siegel, J. Merel, C. Gulcehre, N. Heess, and N. de Freitas. Critic regularized regression. Advances in Neural Information Processing Systems, 33:7768–7778, 2020
[30] A. Nair, A. Gupta, M. Dalal, and S. Levine. Awac: Accelerating online reinforcement learning with offline datasets. arXiv preprint arXiv:2006.09359, 2020
[31] S. M. Xie, H. Pham, X. Dong, N. Du, H. Liu, Y. Lu, P. S. Liang, Q. V. Le, T. Ma, and A. W. Yu. Doremi: Optimizing data mixtures speeds up language model pretraining. Advances in Neural Information Processing Systems, 36:69798–69818, 2023
[32] J. Hejna, C. A. Bhateja, Y. Jiang, K. Pertsch, and D. Sadigh. Remix: Optimizing data mixtures for large scale imitation learning. In P. Agrawal, O. Kroemer, and W. Burgard, editors, Proceedings of The 8th Conference on Robot Learning, volume 270 of Proceedings of Machine Learning Research, pages 145–164. PMLR, 06–09 Nov 2025. URL https://proceedings.ml...
[33] A. Ilyas, S. M. Park, L. Engstrom, G. Leclerc, and A. Madry. Datamodels: Understanding predictions with data and data with predictions. In ICML, 2022
[34] S. Dass, A. Khaddaj, L. Engstrom, A. Madry, A. Ilyas, and R. Martín-Martín. DataMIL: Selecting data for robot imitation learning with datamodels. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=AcTsKglDdh
[35] J. Hejna, S. Mirchandani, A. Balakrishna, A. Xie, A. Wahid, J. Tompson, P. Sanketi, D. Shah, C. Devin, and D. Sadigh. Robot data curation with mutual information estimators. In Robotics: Science and Systems (RSS), 2025
[36] Y. Zhang, Y. Xie, H. Liu, R. Shah, M. Wan, L. Fan, and Y. Zhu. Scizor: Self-supervised data curation for large-scale imitation learning. In IEEE International Conference on Robotics and Automation (ICRA), 2026
[37] Y. J. Ma, J. Hejna, C. Fu, D. Shah, J. Liang, Z. Xu, S. Kirmani, P. Xu, D. Driess, T. Xiao, et al. Vision language models are in-context value learners. In The Thirteenth International Conference on Learning Representations, 2024
[38] Q. Chen, J. Yu, M. Schwager, P. Abbeel, F. Shentu, and P. Wu. SARM: Stage-aware reward modeling for long horizon robot manipulation. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=aemqAxScl9
[39] A. S. Chen, A. M. Lessing, Y. Liu, and C. Finn. Curating demonstrations using online experience. In Proceedings of Robotics: Science and Systems, 2025. doi:10.15607/RSS.2025.XXI.071
[40] C. Agia, R. Sinha, J. Yang, R. Antonova, M. Pavone, H. Nishimura, M. Itkina, and J. Bohg. Cupid: Curating data your robot loves with influence functions. In J. Lim, S. Song, and H.-W. Park, editors, Proceedings of The 9th Conference on Robot Learning, volume 305 of Proceedings of Machine Learning Research, pages 2907–2932. PMLR, 27–30 Sep 2025. URL https:/...
[41] A. Jubien, M. Gautier, and A. Janot. Dynamic identification of the kuka lwr robot using motor torques and joint torque sensors data. In 19th IFAC World Congress, 2014
[42] S. Haddadin, A. De Luca, and A. Albu-Schäffer. Robot collisions: A survey on detection, isolation, and identification. IEEE Transactions on Robotics, 33(6):1292–1312, 2017
[43] M. Linderoth, A. Stolt, A. Robertsson, and R. Johansson. Robotic force estimation using motor torques and modeling of low velocity friction disturbances. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3550–3556, 2013
[44] N. M. Kircanski and A. A. Goldenberg. An experimental study of nonlinear stiffness, hysteresis, and friction effects in robot joints with harmonic drives and torque sensors. The International Journal of Robotics Research, 16(2):214–239, 1997
[45] M. Reuss, N. van Duijkeren, R. Krug, P. Becker, V. Shaj, and G. Neumann. End-to-end learning of hybrid inverse dynamics models for precise and compliant impedance control. In Robotics: Science and Systems, 2022
[46] X. Liu, C. Gong, and qiang liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations,
[47] URL https://openreview.net/forum?id=XVjTT1nw5z
[48] S. Mamedov and S. Mikhel. Practical aspects of model-based collision detection. Frontiers in Robotics and AI, 7:571574, 2020
[49] P. F. Hokayem and M. W. Spong. Bilateral teleoperation: An historical survey. Automatica, 42(12):2035–2057, 2006. ISSN 0005-1098. doi:https://doi.org/10.1016/j.automatica.2006.06.027. URL https://www.sciencedirect.com/science/article/pii/S0005109806002871
[50] M. Reuss, Ö. E. Yağmurlu, F. Wenzel, and R. Lioutikov. Multimodal diffusion transformer: Learning versatile behavior from multimodal goals. In Proceedings of Robotics: Science and Systems, Delft, Netherlands, July 2024. doi:10.15607/RSS.2024.XX.121
[51] T. Z. Zhao, V. Kumar, S. Levine, and C. Finn. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. In Proceedings of Robotics: Science and Systems, Daegu, Republic of Korea, July 2023. doi:10.15607/RSS.2023.XIX.016
[52] C. Gaz, M. Cognetti, A. Oliva, P. Robuffo Giordano, and A. De Luca. Dynamic identification of the franka emika panda robot with retrieval of feasible parameters using penalty-based optimization. IEEE Robotics and Automation Letters, 4(4):4147–4154, 2019. doi:10.1109/LRA.2019.2931248
[53] AgileX Robotics. agx arm urdf: Agilex robot arm urdf models. https://github.com/agilexrobotics/agx_arm_urdf, 2026. Accessed: 2026-05-11
[54] P. Wu, Y. Shentu, Z. Yi, X. Lin, and P. Abbeel. GELLO: A general, low-cost, and intuitive teleoperation framework for robot manipulators. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 12156–12163. IEEE, 2024. doi:10.1109/IROS58592.2024.10801581
[55] O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa, et al. Dinov3. arXiv preprint arXiv:2508.10104, 2025
[56] K. Shaw and D. Pathak. Leap hand v2: Dexterous, low-cost anthropomorphic hybrid rigid soft hand for robot learning. In 2nd Workshop on Dexterous Manipulation: Design, Perception and Control (RSS).

Appendix

Videos of our results and code to recreate our system are available on our website at https://jasonjzliu.com/factr2

A NEXT Implementation Details

A.1...

[1]

ATI Industrial Automation. Gamma Force/Torque sensor. https://www.ati-ia.com/products/ft/ft_models.aspx?id=gamma, 2026. Accessed: April 28, 2026.

[2]

Bota Systems. 6-Axis Force Torque Sensors for Robotics. https://www.botasys.com/force-torque-sensors, 2024. Accessed: April 28, 2026.

[3]

H. Choi, J. E. Low, T. M. Huh, S. Hong, G. A. Uribe, K. A. Hoffmann, J. Di, T. G. Chen, A. A. Stanley, and M. R. Cutkosky. Coinft: A coin-sized, capacitive 6-axis force torque sensor for robotic applications. arXiv preprint arXiv:2503.19225, 2025.

[4]

R. Bhirangi, T. Hellebrekers, C. Majidi, and A. Gupta. Reskin: versatile, replaceable, lasting tactile skins. In CoRL, 2021.

[5]

M. Y. Cao, S. Laws, and F. R. y Baena. Six-axis force/torque sensors for robotics applications: A review. IEEE Sensors Journal, 21(24):27238–27251, 2021.

[6]

J. J. Liu, Y. Li, K. Shaw, T. Tao, R. Salakhutdinov, and D. Pathak. FACTR: Force-Attending Curriculum Training for Contact-Rich Policy Learning. In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2025. doi:10.15607/RSS.2025.XXI.079

[7]

S. Oh, T. Takahashi, C. C. Beltran-Hernandez, Y. Kuroda, and M. Hamaya. A soft wrist with anisotropic and selectable stiffness for robust robot learning in contact-rich manipulation. URL https://arxiv.org/abs/2602.14434

[9]

T. Buamanee, M. Kobayashi, Y. Uranishi, and H. Takemura. Bi-act: Bilateral control-based imitation learning via action chunking with transformer. In 2024 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), pages 410–415, 2024. doi:10.1109/AIM55361.2024.10637173

[10]

K. Yamane, Y. Li, M. Konosu, K. Inami, J. Oaki, T. Tsuji, and S. Sakaino. Design and experimental validation of sensorless 4-channel bilateral teleoperation for low-cost manipulators. URL https://arxiv.org/abs/2507.06174

[12]

H. Shi, S. Hu, Y. Hou, W. Wang, K. Liu, and S. Song. Minimalist compliance control, 2026. URL https://arxiv.org/abs/2603.00913

[13]

K. Inami, K. Yamane, and S. Sakaino. Loss function considering dead zone for neural networks, 2024. URL https://arxiv.org/abs/2402.00393

[14]

A. Zhu, Y. Tanaka, F. Rafeedi, and D. Hong. Cycloidal quasi-direct drive actuator designs with learning-based torque estimation for legged robotics. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 1–8. IEEE, 2025.

[15]

K. M. Park, J. Kim, J. Park, and F. C. Park. Learning-based real-time detection of robot collisions without joint torque sensors. IEEE Robotics and Automation Letters, 6(1):103–110. doi:10.1109/LRA.2020.3033269

[17]

J. Liang and O. Kroemer. Contact localization for robot arms in motion without torque sensing. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 6322–

[18]

K. M. Park, Y. Park, S. Yoon, and F. C. Park. Collision detection for robot manipulators using unsupervised anomaly detection algorithms. IEEE/ASME Transactions on Mechatronics, 27(5):2841–2851, 2022. doi:10.1109/TMECH.2021.3119057

[19]

N. Yilmaz, J. Y. Wu, P. Kazanzides, and U. Tumerdem. Neural network based inverse dynamics identification and external force estimation on the da vinci research kit. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 1387–1393. IEEE, 2020.

[20]

Z. Zhang, H. Xu, Z. Yang, C. Yue, Z. Lin, H.-a. Gao, Z. Wang, and H. Zhao. Elucidating the design space of torque-aware vision-language-action models. In J. Lim, S. Song, and H.-W. Park, editors, Proceedings of The 9th Conference on Robot Learning, volume 305 of Proceedings of Machine Learning Research, pages 4019–4037. PMLR, 27–30 Sep 2025. URL https://pr...

[21]

T. Kamijo, C. C. Beltran-Hernandez, and M. Hamaya. Learning variable compliance control from a few demonstrations for bimanual robot with haptic feedback teleoperation system. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024.

[22]

J. Yu, H. Liu, Q. Yu, J. Ren, C. Hao, H. Ding, G. Huang, G. Huang, Y. Song, P. Cai, W. Zhang, and C. Lu. ForceVLA: Enhancing VLA models with a force-aware moe for contact-rich manipulation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems. URL https://openreview.net/forum?id=2845H8Ua5D

[24]

Z. He, H. Fang, J. Chen, H.-S. Fang, and C. Lu. Foar: Force-aware reactive policy for contact-rich robotic manipulation. IEEE Robotics and Automation Letters, 10(6):5625–5632, 2025. doi:10.1109/LRA.2025.3560871

[25]

W. Liu, J. Wang, Y. Wang, W. Wang, and C. Lu. Forcemimic: Force-centric imitation learning with force-motion capture system for contact-rich manipulation. In 2025 IEEE International Conference on Robotics and Automation (ICRA), 2025.

[26]

H. Ge, Y. Jia, Z. Li, Y. Li, Z. Chen, R. Huang, and G. Zhou. Filic: Dual-loop force-guided imitation learning with impedance torque control for contact-rich manipulation tasks, 2025. URL https://arxiv.org/abs/2509.17053

[27]

H. Xue, J. Ren, W. Chen, G. Zhang, Y. Fang, G. Gu, H. Xu, and C. Lu. Reactive diffusion policy: Slow-fast visual-tactile policy learning for contact-rich manipulation. In Proceedings of Robotics: Science and Systems (RSS), 2025.

[28]

X. B. Peng, A. Kumar, G. Zhang, and S. Levine. Advantage-weighted regression: Simple and scalable off-policy reinforcement learning. arXiv preprint arXiv:1910.00177, 2019.

[29]

Z. Wang, A. Novikov, K. Zolna, J. T. Springenberg, S. Reed, B. Shahriari, N. Siegel, J. Merel, C. Gulcehre, N. Heess, and N. de Freitas. Critic regularized regression. Advances in Neural Information Processing Systems, 33:7768–7778, 2020.

[30]

A. Nair, A. Gupta, M. Dalal, and S. Levine. Awac: Accelerating online reinforcement learning with offline datasets. arXiv preprint arXiv:2006.09359, 2020.

[31]

S. M. Xie, H. Pham, X. Dong, N. Du, H. Liu, Y. Lu, P. S. Liang, Q. V. Le, T. Ma, and A. W. Yu. Doremi: Optimizing data mixtures speeds up language model pretraining. Advances in Neural Information Processing Systems, 36:69798–69818, 2023.

[32]

J. Hejna, C. A. Bhateja, Y. Jiang, K. Pertsch, and D. Sadigh. Remix: Optimizing data mixtures for large scale imitation learning. In P. Agrawal, O. Kroemer, and W. Burgard, editors, Proceedings of The 8th Conference on Robot Learning, volume 270 of Proceedings of Machine Learning Research, pages 145–164. PMLR, 06–09 Nov 2025. URL https://proceedings.ml...

[33]

A. Ilyas, S. M. Park, L. Engstrom, G. Leclerc, and A. Madry. Datamodels: Understanding predictions with data and data with predictions. In ICML, 2022.

[34]

S. Dass, A. Khaddaj, L. Engstrom, A. Madry, A. Ilyas, and R. Martín-Martín. DataMIL: Selecting data for robot imitation learning with datamodels. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=AcTsKglDdh

[35]

J. Hejna, S. Mirchandani, A. Balakrishna, A. Xie, A. Wahid, J. Tompson, P. Sanketi, D. Shah, C. Devin, and D. Sadigh. Robot data curation with mutual information estimators. In Robotics: Science and Systems (RSS), 2025.

[36]

Y. Zhang, Y. Xie, H. Liu, R. Shah, M. Wan, L. Fan, and Y. Zhu. Scizor: Self-supervised data curation for large-scale imitation learning. In IEEE International Conference on Robotics and Automation (ICRA), 2026.

[37]

Y. J. Ma, J. Hejna, C. Fu, D. Shah, J. Liang, Z. Xu, S. Kirmani, P. Xu, D. Driess, T. Xiao, et al. Vision language models are in-context value learners. In The Thirteenth International Conference on Learning Representations, 2024.

[38]

Q. Chen, J. Yu, M. Schwager, P. Abbeel, F. Shentu, and P. Wu. SARM: Stage-aware reward modeling for long horizon robot manipulation. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=aemqAxScl9

[39]

A. S. Chen, A. M. Lessing, Y. Liu, and C. Finn. Curating demonstrations using online experience. In Proceedings of Robotics: Science and Systems, 2025. doi:10.15607/RSS.2025.XXI.071

[40]

C. Agia, R. Sinha, J. Yang, R. Antonova, M. Pavone, H. Nishimura, M. Itkina, and J. Bohg. Cupid: Curating data your robot loves with influence functions. In J. Lim, S. Song, and H.-W. Park, editors, Proceedings of The 9th Conference on Robot Learning, volume 305 of Proceedings of Machine Learning Research, pages 2907–2932. PMLR, 27–30 Sep 2025. URL https:/...

[41]

A. Jubien, M. Gautier, and A. Janot. Dynamic identification of the kuka lwr robot using motor torques and joint torque sensors data. In 19th IFAC World Congress, 2014.

[42]

S. Haddadin, A. De Luca, and A. Albu-Schäffer. Robot collisions: A survey on detection, isolation, and identification. IEEE Transactions on Robotics, 33(6):1292–1312, 2017.

[43]

M. Linderoth, A. Stolt, A. Robertsson, and R. Johansson. Robotic force estimation using motor torques and modeling of low velocity friction disturbances. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3550–3556, 2013.

[44]

N. M. Kircanski and A. A. Goldenberg. An experimental study of nonlinear stiffness, hysteresis, and friction effects in robot joints with harmonic drives and torque sensors. The International Journal of Robotics Research, 16(2):214–239, 1997.

[45]

M. Reuss, N. van Duijkeren, R. Krug, P. Becker, V. Shaj, and G. Neumann. End-to-end learning of hybrid inverse dynamics models for precise and compliant impedance control. In Robotics: Science and Systems, 2022.

[46]

X. Liu, C. Gong, and Q. Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. In The Eleventh International Conference on Learning Representations. URL https://openreview.net/forum?id=XVjTT1nw5z

[48]

S. Mamedov and S. Mikhel. Practical aspects of model-based collision detection. Frontiers in Robotics and AI, 7:571574, 2020.

[49]

P. F. Hokayem and M. W. Spong. Bilateral teleoperation: An historical survey. Automatica, 42(12):2035–2057, 2006. ISSN 0005-1098. doi:https://doi.org/10.1016/j.automatica.2006.06.027. URL https://www.sciencedirect.com/science/article/pii/S0005109806002871

[50]

M. Reuss, Ö. E. Yağmurlu, F. Wenzel, and R. Lioutikov. Multimodal diffusion transformer: Learning versatile behavior from multimodal goals. In Proceedings of Robotics: Science and Systems, Delft, Netherlands, July 2024. doi:10.15607/RSS.2024.XX.121

[51]

T. Z. Zhao, V. Kumar, S. Levine, and C. Finn. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. In Proceedings of Robotics: Science and Systems, Daegu, Republic of Korea, July 2023. doi:10.15607/RSS.2023.XIX.016

[52]

C. Gaz, M. Cognetti, A. Oliva, P. Robuffo Giordano, and A. De Luca. Dynamic identification of the franka emika panda robot with retrieval of feasible parameters using penalty-based optimization. IEEE Robotics and Automation Letters, 4(4):4147–4154, 2019. doi:10.1109/LRA.2019.2931248

[53]

AgileX Robotics. agx arm urdf: Agilex robot arm urdf models. https://github.com/agilexrobotics/agx_arm_urdf, 2026. Accessed: 2026-05-11.

[54]

P. Wu, Y. Shentu, Z. Yi, X. Lin, and P. Abbeel. GELLO: A general, low-cost, and intuitive teleoperation framework for robot manipulators. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 12156–12163. IEEE, 2024. doi:10.1109/IROS58592.2024.10801581

[55]

O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa, et al. Dinov3. arXiv preprint arXiv:2508.10104, 2025.

[56]

K. Shaw and D. Pathak. Leap hand v2: Dexterous, low-cost anthropomorphic hybrid rigid soft hand for robot learning. In 2nd Workshop on Dexterous Manipulation: Design, Perception and Control (RSS).

Appendix

Videos of our results and code to recreate our system are available on our website at https://jasonjzliu.com/factr2

A NEXT Implementation Details

A.1...