X-IONet: Cross-Platform Inertial Odometry Network for Pedestrian and Legged Robot

Changhao Chen; Dehan Shen

arxiv: 2511.08277 · v2 · submitted 2025-11-11 · 💻 cs.RO · cs.LG

Pith reviewed 2026-05-17 23:50 UTC · model grok-4.3

keywords: inertial odometry · IMU navigation · pedestrian tracking · legged robots · expert networks · attention mechanism · Extended Kalman Filter

The pith

X-IONet uses a single IMU with rule-based expert selection and dual-stage attention to deliver accurate odometry for both pedestrians and legged robots.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents X-IONet to address the drop in performance that learning-based inertial odometry models suffer when moving from human walking data to the faster, more varied motions of quadruped robots. It adds a rule-based module that first identifies the motion platform from the IMU sequence and then routes the data to a matching expert network. Each expert network applies a dual-stage attention mechanism to model both long sequences of motion over time and the relationships between the sensor's three axes, while also producing an uncertainty estimate for each displacement prediction. These outputs are combined in an Extended Kalman Filter to maintain a consistent state estimate. Experiments across three datasets show consistent error reductions, indicating that explicit platform separation inside one framework can unify navigation for very different dynamic systems.

Core claim

X-IONet incorporates a rule-based expert selection module to classify motion platforms and route IMU sequences to platform-specific expert networks. The displacement prediction network features a dual-stage attention architecture that jointly models long-range temporal dependencies and inter-axis correlations, enabling accurate motion representation. It outputs both displacement and associated uncertainty, which are further fused through an Extended Kalman Filter (EKF) for robust state estimation.
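To make the "dual-stage" idea concrete, the following is a minimal sketch of attention applied first along time and then across sensor channels. It is not the paper's network: the identity query/key/value projections, the six-channel layout, the stage ordering, and the names `self_attention` and `dual_stage_attention` are all assumptions made for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    tokens: list of equal-length feature vectors. Learned query/key/value
    projections are omitted (an assumption) to keep the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out


def dual_stage_attention(x):
    """x[t][c] is a feature vector for time segment t and sensor channel c."""
    n_time, n_chan = len(x), len(x[0])
    # Stage 1: for each channel, attend across time segments.
    stage1 = [[None] * n_chan for _ in range(n_time)]
    for c in range(n_chan):
        attended = self_attention([x[t][c] for t in range(n_time)])
        for t in range(n_time):
            stage1[t][c] = attended[t]
    # Stage 2: for each time segment, attend across channels.
    return [self_attention(stage1[t]) for t in range(n_time)]


random.seed(0)
T, C, D = 4, 6, 8  # segments, channels (assumed 3 accel + 3 gyro), feature width
x = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(C)] for _ in range(T)]
y = dual_stage_attention(x)
print(len(y), len(y[0]), len(y[0][0]))  # 4 6 8
```

The output keeps the input's (time, channel, feature) shape, so a block like this could be stacked or followed by a regression head that emits displacement and uncertainty.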

What carries the argument

A rule-based expert selection module that classifies IMU data as pedestrian or legged-robot motion and routes it to dedicated expert networks equipped with dual-stage attention.
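As a rough illustration of what rule-based routing could look like, the sketch below classifies a window of accelerometer magnitudes using two hand-set rules. The features (signal variance and dominant frequency), both threshold values, and the names `dominant_frequency` and `select_expert` are hypothetical; the summary above does not state the paper's actual rules.

```python
import cmath
import math
from statistics import pvariance

# Hypothetical thresholds, chosen only for illustration; the paper's actual
# rules and values are not given in this summary.
ACC_VAR_THRESHOLD = 4.0      # (m/s^2)^2
DOM_FREQ_THRESHOLD_HZ = 3.0  # Hz


def dominant_frequency(signal, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC DFT bin."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


def select_expert(acc_norm, sample_rate_hz):
    """Route a window of accelerometer magnitudes to an expert label."""
    variance = pvariance(acc_norm)
    freq = dominant_frequency(acc_norm, sample_rate_hz)
    if variance > ACC_VAR_THRESHOLD and freq > DOM_FREQ_THRESHOLD_HZ:
        return "legged_robot"
    return "pedestrian"


# Synthetic windows: slow, gentle oscillation vs. fast, strong oscillation.
rate = 100.0
walk = [9.81 + 1.0 * math.sin(2 * math.pi * 2.0 * i / rate) for i in range(200)]
trot = [9.81 + 4.0 * math.sin(2 * math.pi * 6.0 * i / rate) for i in range(200)]
print(select_expert(walk, rate))
print(select_expert(trot, rate))
```

A rule set this simple is cheap and transparent, but it also shows why the selector is load-bearing: a window that falls on the wrong side of either threshold is sent to the wrong expert.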

If this is right

A single IMU-based system can replace separate pedestrian and robot navigation pipelines.
Uncertainty estimates from the attention network directly improve the stability of EKF-based state tracking.
Error reductions observed on RoNIN, GrandTour, and Go2 datasets follow from the platform-specific routing.
The same architecture supports deployment in environments that mix human and quadruped motion.
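The link between predicted uncertainty and filter behaviour can be sketched with a single-axis Kalman measurement update. This is a simplified linear stand-in, not the paper's EKF: the two-element state, the measurement model `z = p_now - p_past`, the log-standard-deviation parameterization, and the function name `displacement_update` are assumptions for illustration.

```python
import math


def displacement_update(state, cov, measured_disp, log_sigma):
    """One Kalman measurement update for a single axis.

    state: [p_past, p_now], positions at the two ends of the IMU window.
    cov: 2x2 state covariance as nested lists.
    measured_disp: network-predicted displacement p_now - p_past.
    log_sigma: network-predicted log standard deviation of that displacement.
    """
    h = [-1.0, 1.0]                  # measurement model: z = p_now - p_past
    r = math.exp(2.0 * log_sigma)    # measurement variance from the network
    innovation = measured_disp - (state[1] - state[0])
    ph = [cov[0][0] * h[0] + cov[0][1] * h[1],
          cov[1][0] * h[0] + cov[1][1] * h[1]]   # P H^T
    s = h[0] * ph[0] + h[1] * ph[1] + r          # innovation variance
    gain = [ph[0] / s, ph[1] / s]                # Kalman gain
    new_state = [state[i] + gain[i] * innovation for i in range(2)]
    # P <- (I - K H) P
    new_cov = [[cov[i][j] - gain[i] * (h[0] * cov[0][j] + h[1] * cov[1][j])
                for j in range(2)] for i in range(2)]
    return new_state, new_cov


state = [0.0, 0.9]
cov = [[0.01, 0.0], [0.0, 0.25]]
confident, _ = displacement_update(state, cov, 1.0, math.log(0.05))
uncertain, _ = displacement_update(state, cov, 1.0, math.log(1.0))
print(confident[1] - confident[0], uncertain[1] - uncertain[0])
```

With a small predicted sigma the filtered displacement moves almost all the way to the measured 1.0; with a large sigma it stays near the prior 0.9. That is the mechanism by which well-calibrated uncertainty can stabilize the state estimate.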

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending the classification rules to wheeled platforms or mixed teams of humans and robots would test the framework's broader applicability.
The dual-stage attention may capture general motion invariants that could reduce reliance on hand-crafted motion models in other sensor fusion tasks.
Collecting a larger set of edge-case IMU sequences could reveal whether the expert networks truly generalize beyond the current training distributions.

Load-bearing premise

The rule-based expert selection module can reliably classify IMU sequences as pedestrian or legged-robot motion, and the platform-specific expert networks generalize to new motion patterns not seen during training.

What would settle it

Performance collapse or frequent misclassification when the system encounters unseen gaits, speeds, or surface conditions not represented in the training sets.

Figures

Figures reproduced from arXiv: 2511.08277 by Changhao Chen, Dehan Shen.

**Figure 2.** Overall framework of the proposed X-IONet framework. The raw inertial data are rotated using the attitude estimated by EKF

**Figure 3.** The quadruped robot used in the experiments.

**Figure 4.** Trajectory comparisons of partial experimental results. The top three trajectory plots illustrate the comparisons of different methods

**Figure 5.** The trajectory of the quadruped robot predicted using the

Original abstract

Learning-based inertial odometry has achieved remarkable progress in pedestrian navigation. However, extending these methods to quadruped robots remains challenging due to their distinct and highly dynamic motion patterns. Models that perform well on pedestrian data often experience severe degradation when deployed on legged platforms. To tackle this challenge, we introduce X-IONet, a cross-platform inertial odometry framework that operates solely using a single Inertial Measurement Unit (IMU). X-IONet incorporates a rule-based expert selection module to classify motion platforms and route IMU sequences to platform-specific expert networks. The displacement prediction network features a dual-stage attention architecture that jointly models long-range temporal dependencies and inter-axis correlations, enabling accurate motion representation. It outputs both displacement and associated uncertainty, which are further fused through an Extended Kalman Filter (EKF) for robust state estimation. Extensive experiments on the public RoNIN pedestrian dataset, the GrandTour quadruped dataset, and a self-collected Go2 quadruped dataset demonstrate that X-IONet achieves state-of-the-art performance, reducing ATE and RTE by 14.3% and 11.4% on RoNIN, 11.8% and 9.7% on GrandTour, and 52.8% and 41.3% on Go2. These results highlight X-IONet's effectiveness for accurate and robust inertial navigation across both human and legged robot platforms.
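For readers unfamiliar with the metrics, the sketch below shows one common way to compute ATE, RTE, and a percentage reduction on 2D trajectories. The exact definitions used in the paper (alignment, window length, averaging) are not given in the abstract, so the conventions here and the function names `ate`, `rte`, and `percent_reduction` are assumptions.

```python
import math


def ate(est, gt):
    """Absolute trajectory error: RMSE of position error over the whole path."""
    sq = [(ex - gx) ** 2 + (ey - gy) ** 2 for (ex, ey), (gx, gy) in zip(est, gt)]
    return math.sqrt(sum(sq) / len(sq))


def rte(est, gt, window):
    """Relative trajectory error: RMSE of drift over windows of `window` samples."""
    sq = []
    for i in range(len(est) - window):
        j = i + window
        dex, dey = est[j][0] - est[i][0], est[j][1] - est[i][1]
        dgx, dgy = gt[j][0] - gt[i][0], gt[j][1] - gt[i][1]
        sq.append((dex - dgx) ** 2 + (dey - dgy) ** 2)
    return math.sqrt(sum(sq) / len(sq))


def percent_reduction(baseline, ours):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline


# Toy data: ground truth moves along x; the estimate drifts linearly in y.
gt = [(float(i), 0.0) for i in range(11)]
est = [(float(i), 0.1 * i) for i in range(11)]
print(ate(est, gt), rte(est, gt, 5), percent_reduction(2.0, 1.0))
```

Because a percentage reduction depends on the baseline's absolute error, the same percentage can correspond to very different gains in metres.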

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

X-IONet routes IMU data through a rule-based classifier to platform-specific experts with dual attention, delivering reported error drops on pedestrian and quadruped datasets, though the classifier's reliability is thinly documented.

The main point is that X-IONet adds a rule-based module to pick between pedestrian and legged-robot expert networks, then uses dual-stage attention to model temporal sequences and axis correlations in raw IMU data for displacement prediction before EKF fusion. This setup targets the gap where pedestrian-trained models fail on the jerkier motions of quadrupeds like the Go2. The paper shows this on RoNIN, GrandTour, and a new Go2 collection, with the largest gains on the robot data. That cross-platform routing plus the attention design is the concrete addition over prior pedestrian-only inertial work. The experiments give numbers that look usable for vision-free navigation on dynamic platforms.

The rule-based selector is the part that carries the load, and the abstract gives no thresholds, accuracy figures, or tests on edge cases such as gait switches or added noise. Without those, it is hard to know whether the reported ATE and RTE cuts come from the attention layers or simply from correct routing on the test sets. Baseline details and split information are also missing from the summary, which leaves the size of the improvement open to interpretation. The EKF step is standard and does not add new claims.

This paper is aimed at robotics groups that need IMU-only state estimation on both humans and legged machines. Anyone working on sensor fusion or domain adaptation for inertial data would find the routing idea worth checking. It deserves peer review because the problem is real and the experiments cover multiple platforms, even if the classification robustness needs tighter evidence.

Referee Report

2 major / 1 minor

Summary. The paper introduces X-IONet, a cross-platform inertial odometry framework using only a single IMU. It employs a rule-based expert selection module to classify motion platforms (pedestrian vs. legged robot) and route sequences to platform-specific expert networks. These networks use a dual-stage attention architecture to jointly model temporal dependencies and inter-axis correlations, predicting displacement and uncertainty that are fused via an Extended Kalman Filter. Experiments on the RoNIN pedestrian dataset, GrandTour quadruped dataset, and a self-collected Go2 quadruped dataset report state-of-the-art results with ATE/RTE reductions of 14.3%/11.4%, 11.8%/9.7%, and 52.8%/41.3% respectively.

Significance. If the performance gains are shown to be robust and not attributable to platform-specific routing artifacts or implementation details, the work would meaningfully extend learning-based inertial odometry to legged robots by addressing the domain gap in motion dynamics between humans and quadrupeds.

major comments (2)

[Abstract and Method Description] The rule-based expert selection module is load-bearing for the cross-platform claim, yet the manuscript supplies no explicit classification rules, thresholds, accuracy metrics, or ablation on misclassification rates (e.g., on gait transitions or out-of-distribution IMU patterns). Without these, the reported ATE/RTE improvements cannot be confidently attributed to the dual-stage attention architecture rather than correct expert routing.
[Experiments] The quantitative SOTA claims lack supporting details on baseline implementations, data splits, statistical testing, or controls for post-hoc tuning. This leaves the central performance improvements only partially supported, as the abstract provides no variance, absolute error values, or comparison methodology.

minor comments (1)

[Abstract] The abstract reports percentage reductions without accompanying absolute ATE/RTE values or standard deviations, which would aid assessment of practical impact.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We appreciate the referee's careful reading and address each major comment below, outlining the revisions we will incorporate to improve clarity and rigor.

Referee: [Abstract and Method Description] The rule-based expert selection module is load-bearing for the cross-platform claim, yet the manuscript supplies no explicit classification rules, thresholds, accuracy metrics, or ablation on misclassification rates (e.g., on gait transitions or out-of-distribution IMU patterns). Without these, the reported ATE/RTE improvements cannot be confidently attributed to the dual-stage attention architecture rather than correct expert routing.

Authors: We agree that the rule-based expert selection module is central to the cross-platform contribution and that its details must be made explicit. In the revised manuscript, we will add a dedicated subsection describing the classification rules, including the specific IMU-derived features (e.g., acceleration variance thresholds and dominant frequency bands) and decision thresholds used to route sequences to the pedestrian or quadruped expert. We will also report classification accuracy on all three evaluation datasets and include an ablation that quantifies the effect of misclassification rates, with particular attention to gait transitions and out-of-distribution IMU segments. These additions will allow readers to separate the contributions of the routing mechanism from those of the dual-stage attention architecture. revision: yes
Referee: [Experiments] The quantitative SOTA claims lack supporting details on baseline implementations, data splits, statistical testing, or controls for post-hoc tuning. This leaves the central performance improvements only partially supported, as the abstract provides no variance, absolute error values, or comparison methodology.

Authors: We acknowledge that the experimental section requires greater transparency to support the reported improvements. The revised manuscript will expand the experimental protocol with full specifications of all baseline implementations (including code references or hyperparameter settings), the exact train/validation/test splits for RoNIN, GrandTour, and the Go2 dataset, and absolute ATE/RTE values accompanied by standard deviations across repeated runs. We will add statistical significance testing (e.g., paired Wilcoxon tests) and explicitly state that no post-hoc tuning was performed on held-out test sequences. These changes will provide a clearer and more reproducible basis for the quantitative claims. revision: yes
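The paired test mentioned in the response can be sketched as follows for a small number of test sequences. The per-sequence error values are made up for illustration, and `signed_rank_test` is a hypothetical helper that computes an exact two-sided Wilcoxon signed-rank p-value by enumeration; it is not taken from the paper.

```python
from itertools import product


def signed_rank_test(a, b):
    """Exact two-sided paired Wilcoxon signed-rank test (small samples only)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]   # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):                             # average ranks over ties
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    total = sum(ranks)
    observed = min(w_plus, total - w_plus)
    # Under the null each sign is equally likely: enumerate all 2^n patterns.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return w_plus, hits / 2 ** len(ranks)


# Made-up per-sequence ATE values (metres) for a baseline and a new method.
baseline = [1.2, 0.9, 1.5, 1.1, 1.8, 1.3, 1.0, 1.6]
ours = [1.0, 0.8, 1.1, 1.0, 1.2, 1.1, 0.9, 1.3]
w_plus, p_value = signed_rank_test(baseline, ours)
print(w_plus, p_value)
```

The enumeration is exponential in the number of sequences, so this form only suits a handful of test trajectories; larger evaluations would normally rely on a statistics library.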

Circularity Check

0 steps flagged

No circularity: empirical results from trained network on public datasets

The paper presents an ML architecture (rule-based expert selection routing to platform-specific networks with dual-stage attention, plus EKF fusion) whose performance claims are measured via standard ATE/RTE metrics on held-out sequences from RoNIN, GrandTour, and Go2. No equations or derivations are supplied that reduce the reported gains to quantities defined by the authors' own fitted parameters or self-citations; the method description relies on conventional attention blocks and filtering without self-referential definitions or fitted-input-as-prediction patterns. The central claims therefore remain independent of the inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that IMU signals contain distinguishable platform-specific signatures and that learned attention can extract usable displacement information from raw inertial sequences.

free parameters (1)

rule-based classification thresholds
Thresholds or rules used by the expert selection module are chosen or tuned to separate pedestrian from quadruped motion patterns.

axioms (1)

domain assumption: IMU measurements provide sufficient information to distinguish motion platform type and to predict displacement
Invoked by the design of the expert selection module and the displacement prediction network.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat embedding and orbit structure · unclear
Paper passage: The displacement prediction network features a dual-stage attention architecture that jointly models long-range temporal dependencies and inter-axis correlations

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Paper passage: rule-based expert selection module to classify motion platforms

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

