
arxiv: 2606.05501 · v1 · pith:G37CZCNPnew · submitted 2026-06-03 · 💻 cs.RO

Learning Contact Representation for Leg Odometry

Pith reviewed 2026-06-28 05:33 UTC · model grok-4.3

classification 💻 cs.RO
keywords legged robots · contact detection · self-supervised learning · odometry · joint encoders · stance phase · swing phase · representation learning

The pith

Self-supervised learning from joint encoders detects contact states for legged robot odometry without force sensors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that contact detection, essential for accurate leg odometry in legged robots, can be achieved using only joint encoder data through a self-supervised representation learning approach. This is important because force sensors are not always available and can miss disturbances like slippage during contact. The method learns representations to probabilistically model stance and swing phases and demonstrates better performance than supervised methods that require additional sensors and labeling, as well as standard probabilistic baselines.

Core claim

The self-supervised framework learns contact representations solely from joint encoder signals to model stance and swing phases probabilistically, achieving superior contact detection for odometry estimation compared to methods relying on force sensor augmentation or ground-truth labels.

What carries the argument

Self-supervised representation learning framework applied to joint encoder signals for probabilistic modeling of contact phases.
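As a rough illustration of what probabilistic modeling of contact phases can look like, the sketch below computes a stance posterior from a scalar latent feature using a two-component Gaussian mixture. The mixture form, the 1-D latent, and every parameter value are assumptions made for illustration; the paper's actual model is not described on this page.

```python
import math


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def stance_probability(
    latent: float,
    stance: tuple = (1.0, 0.5),   # hypothetical (mean, std) of the stance cluster
    swing: tuple = (-1.0, 0.5),   # hypothetical (mean, std) of the swing cluster
    prior_stance: float = 0.5,    # hypothetical prior probability of stance
) -> float:
    """Posterior P(stance | latent) via Bayes' rule over two Gaussians."""
    p_stance = prior_stance * gaussian_pdf(latent, *stance)
    p_swing = (1.0 - prior_stance) * gaussian_pdf(latent, *swing)
    total = p_stance + p_swing
    if total == 0.0:
        # Both densities underflowed; fall back to the prior.
        return prior_stance
    return p_stance / total
```

A soft output like this, rather than a hard stance/swing label, is what lets a downstream filter scale its trust in each leg's kinematic feedback.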

If this is right

  • Contact detection works without mounting force sensors at the foot.
  • No ground-truth labels or sensor set augmentation is needed for training.
  • Probabilistic modeling of stance and swing improves odometry feedback.
  • The approach handles unaccounted disturbances like slippage better than force-based methods.
  • Public availability of the code enables direct use on other legged robots.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Such methods could reduce hardware requirements across various legged robot designs.
  • Combining this with other sensor modalities might further improve robustness in challenging terrains.
  • Generalization to different robot morphologies could be tested by applying the framework to new platforms.
  • Long-term deployment might reveal needs for online adaptation of the learned representations.

Load-bearing premise

Joint encoder signals alone provide enough information to accurately learn contact states without force sensors or any ground-truth labels.

What would settle it

A controlled experiment on a legged robot performing stance phases with known slippage or external disturbances where the learned detector incorrectly classifies the contact state.

Figures

Figures reproduced from arXiv: 2606.05501 by Cagri Kilic, Emre Girgin.

Figure 1: Our framework learns the latent representations of the leg kinematics and model the latent ...
Figure 3: Stance probabilities across contact detectors during a slippage event. The non-zero GRF ...
Figure 4: Latent space projected onto PCA (top) and UMAP (bottom) dimensions. Colors indicate ...
Figure 5: Feature Selection outputs ranking the predictive power of kinematic variables relative to ...
Abstract

The estimation of odometry in legged robots depends on the assumption that the velocity of the foot with respect to the world remains zero during the stance phase. Feedback for the main body velocity is derived from the kinematic serial chain of the feet, making accurate leg phase detection a critical subproblem. A considerable number of studies employ ground reaction force sensors mounted at the tip of the foot to classify the leg phase, yet these sensors may not be universally available for all legged robots. Additionally, these sensors are often unresponsive to unaccounted disturbances, such as slippage, while the foot remains in contact with the ground. In this study, we propose a self-supervised representation learning framework for contact detection that utilizes the standard sensor set of joint encoders without reliance on force sensor augmentations. We employ learned representations to model the stance and swing phases probabilistically. The experimental results obtained confirm the efficacy of the proposed self-supervised contact detector. Our framework exhibited superior performance in comparison to supervised methods, which necessitate sensor set augmentation and labeling, as well as baseline probabilistic approaches. Additionally, we make our code available to the public.
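One way to picture how a probabilistic contact estimate could feed the body-velocity feedback described in the abstract is a stance-weighted average of per-leg velocity candidates. This is a simplified sketch, not the paper's estimator: the weighting scheme, the function name `fuse_base_velocity`, and the `min_total_weight` threshold are assumptions introduced here.

```python
from typing import List, Optional, Sequence


def fuse_base_velocity(
    leg_velocities: Sequence[Sequence[float]],
    stance_probs: Sequence[float],
    min_total_weight: float = 1e-6,  # hypothetical cutoff for "no leg in stance"
) -> Optional[List[float]]:
    """Average per-leg base-velocity candidates, weighted by stance probability.

    Each candidate is the base velocity implied by one leg if that foot
    were stationary in the world frame (the zero-velocity assumption).
    Returns None when no leg carries enough stance weight.
    """
    if len(leg_velocities) != len(stance_probs):
        raise ValueError("one stance probability is required per leg")
    total = sum(stance_probs)
    if total < min_total_weight:
        return None
    dim = len(leg_velocities[0])
    fused = [0.0] * dim
    for velocity, prob in zip(leg_velocities, stance_probs):
        for axis in range(dim):
            fused[axis] += prob * velocity[axis]
    return [value / total for value in fused]
```

A leg judged to be swinging or slipping receives a low weight, so its kinematic velocity contributes little to the fused estimate.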

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes a self-supervised representation learning framework for contact detection in legged robot leg odometry that relies solely on joint encoder signals (positions and velocities), without force sensors or ground-truth labels. Learned representations are used to probabilistically model stance and swing phases. Experiments are claimed to show superior performance compared to supervised methods requiring sensor augmentation and labeling, as well as baseline probabilistic approaches. The code is made publicly available.

Significance. If the central performance claim holds under the self-supervised objective, the work would be significant for legged robotics by enabling reliable contact estimation and odometry on platforms lacking GRF sensors, while addressing slippage issues that force sensors may miss. The self-supervised and code-release aspects strengthen potential impact and reproducibility.

major comments (1)
  1. [Abstract] The central claim that joint encoder signals alone suffice for accurate self-supervised contact state separation (enabling better probabilistic modeling than labeled supervised methods) is load-bearing, yet the manuscript provides no analysis or section addressing whether kinematic signals can resolve ambiguities arising from compliance, backlash, or light contact during stance.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and the opportunity to respond. We address the single major comment below.

point-by-point responses
  1. Referee: [Abstract] The central claim that joint encoder signals alone suffice for accurate self-supervised contact state separation (enabling better probabilistic modeling than labeled supervised methods) is load-bearing, yet the manuscript provides no analysis or section addressing whether kinematic signals can resolve ambiguities arising from compliance, backlash, or light contact during stance.

    Authors: The experimental evaluation is performed on physical legged robot platforms, where compliance, backlash, and varying contact conditions (including light contact) are inherent in the joint encoder data collected during locomotion. The self-supervised representations are shown to yield superior contact phase separation and downstream odometry accuracy relative to supervised baselines that rely on force sensors. This empirical outcome indicates that the kinematic signals, when processed through the learned probabilistic model, suffice to distinguish stance and swing despite the listed ambiguities; a dedicated theoretical section on each ambiguity type is not required to support the central claim given the real-world validation.

    Revision: no

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents a self-supervised representation learning approach for contact detection from joint encoders, followed by probabilistic modeling of stance/swing phases and experimental validation. No equations, derivations, fitted parameters presented as predictions, or self-citations appear in the abstract or description. The central claims rest on empirical results rather than any self-referential reduction of outputs to inputs by construction. The framework is self-contained against external benchmarks with no load-bearing self-citation or ansatz smuggling visible.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that joint data suffices for contact classification.



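Appendix A (Error State Extended Kalman Filter Design), excerpt

The measurement function $h(x_k)$ computes the predicted world-frame velocity of the stance foot by superimposing the base velocity $v^W_k$ with the body-relative foot velocity $v^B_{\mathrm{rel},i}$. For each foot $i$, the measurement Jacobian $H_k = \partial h / \partial \delta x \in \mathbb{R}^{3 \times 15}$ is formulated as:

$$h(x_k) = v^W_k + R_k v^B_{\mathrm{rel},i}, \qquad v^B_{\mathrm{rel},i} = \omega^B \times p^B_i + J_{v,i}(q_i)\,\dot{q}_i$$

$$H_k = [\, 0_{3\times 3} \;\; I_{3\times 3} \;\; -R_k [v^B_{\mathrm{rel},i}]_\times \;\; 0_{3\times 3} \;\; R_k [p^B \ldots$$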
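The measurement function in this excerpt can be sketched in a few lines. The example below evaluates h(x_k) = v^W_k + R_k v^B_rel,i for one foot using plain Python lists. The function names and argument layout are illustrative assumptions, and the Jacobian H_k is left out because the source truncates it.

```python
from typing import List, Sequence

Vec = List[float]


def cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    """Cross product a x b of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def mat_vec(m: Sequence[Sequence[float]], v: Sequence[float]) -> Vec:
    """Matrix-vector product for a row-major nested-list matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def foot_velocity_body(omega_b, p_b, jac_v, q_dot) -> Vec:
    """v_rel = omega x p + J_v(q) q_dot, expressed in the body frame."""
    rotation_term = cross(omega_b, p_b)
    joint_term = mat_vec(jac_v, q_dot)
    return [r + j for r, j in zip(rotation_term, joint_term)]


def predicted_foot_velocity_world(v_w, rot_wb, omega_b, p_b, jac_v, q_dot) -> Vec:
    """h(x) = v_w + R v_rel: predicted world-frame velocity of one foot."""
    v_rel_world = mat_vec(rot_wb, foot_velocity_body(omega_b, p_b, jac_v, q_dot))
    return [v + r for v, r in zip(v_w, v_rel_world)]
```

For a foot in firm stance the prediction should be close to zero, which is what allows a filter to compare it against a zero-velocity measurement.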