MARIO: Motion-Augmented Real-Time Multi-Sensor Inertial Odometry

Chenfeng Gao; Karan Ahuja; Taeyoung Yeon; Vasco Xu; Xuanyou Liu; Yiquan Li

arxiv: 2606.02996 · v1 · pith:EFIVZROO · submitted 2026-06-02 · 💻 cs.RO · cs.CV · cs.HC

Yiquan Li, Taeyoung Yeon, Chenfeng Gao, Vasco Xu, Xuanyou Liu, Karan Ahuja

Pith reviewed 2026-06-28 10:12 UTC · model grok-4.3

classification 💻 cs.RO · cs.CV · cs.HC

keywords: inertial odometry · pose prior · human motion dynamics · sensor fusion · drift reduction · AR glasses · Nymeria dataset · multi-sensor inertial tracking

The pith

A learned IMU-inferred pose prior enforces human motion constraints to reduce inertial odometry drift by up to 36%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that current learning-based inertial odometry methods suffer from drift because they ignore human motion dynamics, and shows that inserting a learned pose prior derived from IMU signals grounds estimates in physically consistent kinematics. Integrating this prior into existing architectures cuts positional drift by up to 36 percent on the large Nymeria dataset. Adding a fusion step that pulls in magnetometer, barometer, and secondary IMU readings from standard AR glasses raises the reduction to as much as 42 percent while increasing robustness across varied activities. Readers would care because the result points to reliable camera-free tracking on everyday wearables using only the sensors already present.

Core claim

Grounding inertial odometry in human kinematics through a learned IMU-inferred pose prior that promotes physically consistent motion constraints, then integrating this prior into existing IO architectures, reduces positional drift by up to 36 percent on the Nymeria dataset. A sensor-fusion framework that further incorporates auxiliary signals from magnetometers, barometers, and secondary IMUs reduces drift by up to 42 percent and improves robustness and generalization across diverse motion conditions.

What carries the argument

learned IMU-inferred pose prior that enforces physically consistent human motion constraints within the odometry estimation pipeline

If this is right

Positional drift is reduced by up to 36 percent when the pose prior is integrated into existing IO architectures on the Nymeria dataset.
A sensor-fusion framework using magnetometers, barometers, and secondary IMUs brings the total reduction in positional drift to as much as 42 percent.
The fusion strategy improves robustness and generalization across diverse motion conditions.
The combined approach unifies human motion kinematics with multimodal sensing to set a new benchmark for camera-less human tracking.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same pose-prior construction could be tested on other large human-motion datasets to check whether the 36 percent drift reduction holds beyond Nymeria.
The multi-sensor fusion layer might be extended to additional lightweight signals such as heart-rate or GPS when they become available on future AR hardware.
Longer tracking sessions without drift accumulation could support continuous applications such as indoor navigation or rehabilitation monitoring.

Load-bearing premise

The learned IMU-inferred pose prior accurately captures and enforces human motion dynamics without introducing new errors or biases into the odometry estimates.

What would settle it

Running the baseline IO architecture with and without the learned pose prior on the full Nymeria dataset and finding equal or higher average positional drift when the prior is included would falsify the central claim.
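The falsification test above reduces to a simple comparison of average drift with and without the prior. A minimal sketch follows, assuming one positional-drift value per test sequence for each run; the function names and the sample numbers are hypothetical illustrations, not the paper's evaluation code or results.

```python
from statistics import fmean


def relative_drift_reduction(baseline_err, prior_err):
    """Fractional reduction in mean drift; positive means the prior helps."""
    if len(baseline_err) != len(prior_err) or not baseline_err:
        raise ValueError("need one drift value per sequence for both runs")
    return 1.0 - fmean(prior_err) / fmean(baseline_err)


def prior_claim_falsified(baseline_err, prior_err):
    """True when the run with the prior has equal or higher mean drift."""
    return relative_drift_reduction(baseline_err, prior_err) <= 0.0


# Made-up per-sequence drift values in metres, for illustration only.
baseline = [2.0, 4.0, 6.0]
with_prior = [1.5, 3.0, 3.0]

print(relative_drift_reduction(baseline, with_prior))  # 0.375
print(prior_claim_falsified(baseline, with_prior))     # False
```

A mean over sequences is only one possible aggregate; a median or a per-activity breakdown could change the verdict, so the choice of aggregate should be fixed before running the comparison.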

Figures

Figures reproduced from arXiv: 2606.02996 by Chenfeng Gao, Karan Ahuja, Taeyoung Yeon, Vasco Xu, Xuanyou Liu, Yiquan Li.

**Figure 3.** Visualization of altitude from the barometer compared ...

**Figure 2.** Overview of the MARIO inertial odometry framework.

**Figure 5.** Trajectory visualizations of TLIO, TLIO+Pose, and ...

**Figure 6.** RTE-5s CDF on the Nymeria dataset for AirIO, TLIO, ...

**Figure 7.** We show the trajectory predictions alongside the ground truth for AirIO, TLIO, EqNIO, and RoNIN-LSTM on one sequence ...

Original abstract

Inertial odometry (IO) using only Inertial Measurement Units (IMUs) provides a lightweight solution for human motion tracking in augmented reality (AR) and wearable devices. Recent learning-based IO methods have improved the generalizability of inertial localization through large-scale pretraining on human motion datasets. However, these approaches remain prone to drift and noise because they do not explicitly capture human motion dynamics, especially on daily activity datasets such as Nymeria. In this work, we propose to ground inertial odometry in human kinematics through a learned IMU-inferred pose prior, which promotes physically consistent motion constraints. We integrate this pose prior into existing IO architectures and reduce positional drift by up to 36% on the challenging Nymeria dataset, which is 5x larger than datasets used in prior work. We further improve long-term performance with a sensor-fusion framework that incorporates auxiliary signals from lightweight sensors already available on commercial AR glasses, including magnetometers, barometers, and secondary IMUs. With this fusion strategy, positional drift is reduced by up to 42%, improving robustness and generalization across diverse motion conditions. Together, our results introduce a new paradigm for inertial and lightweight odometry by unifying human motion kinematics with multimodal sensing, setting a new benchmark for accurate and robust camera-less human tracking. Our website is available at https://spice-lab.org/projects/MARIO/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper reports solid drift reductions on a large daily-activity dataset by adding a learned pose prior and auxiliary sensors, but the abstract and stress-test note leave open whether the prior actually enforces kinematics or just adds modeling capacity.

The core result here is a 36% drop in positional drift from the pose prior alone and 42% with added magnetometer, barometer, and secondary IMU signals on the Nymeria dataset. That dataset is five times larger than the ones used in the cited prior work, so the numbers matter for anyone doing camera-free tracking on wearables.

What the work does cleanly is take existing learning-based inertial odometry pipelines and bolt on two practical pieces: an IMU-derived pose prior meant to capture human motion constraints, plus a lightweight fusion step that uses sensors already on commercial AR glasses. The claim that this unifies kinematics with multimodal sensing is reasonable given the setup, and the website link suggests they may have released code or data.

The soft spot is exactly the one flagged in the stress-test note. The abstract gives final drift numbers but does not describe any check that trajectories satisfy independent kinematic invariants (foot velocity at contact, pelvis height bounds, joint limits) at higher rates than the baseline. Without an ablation that isolates the prior from extra network capacity or from the sensor fusion itself, the improvement could come from dataset correlations rather than enforced dynamics. If the full paper has those checks and they hold, the contribution strengthens; if not, the kinematic story is harder to defend.
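The kinematic-invariant check described in this paragraph could look roughly like the sketch below. It assumes per-frame foot positions, binary foot-contact labels, and pelvis heights are available for each predicted trajectory. The function names and the thresholds (0.05 m/s sliding speed, 0.2 m to 1.3 m pelvis height) are hypothetical placeholders, not values from the paper.

```python
import math


def foot_slide_rate(foot_positions, contact_flags, dt, max_speed=0.05):
    """Fraction of in-contact frame pairs where the foot exceeds max_speed (m/s)."""
    contacts = 0
    violations = 0
    for i in range(1, len(foot_positions)):
        # Only score frame pairs where the foot is labelled in contact throughout.
        if not (contact_flags[i - 1] and contact_flags[i]):
            continue
        contacts += 1
        speed = math.dist(foot_positions[i], foot_positions[i - 1]) / dt
        if speed > max_speed:
            violations += 1
    return violations / contacts if contacts else 0.0


def pelvis_out_of_bounds_rate(pelvis_heights, lo=0.2, hi=1.3):
    """Fraction of frames where pelvis height (m) falls outside [lo, hi]."""
    if not pelvis_heights:
        return 0.0
    outside = sum(1 for h in pelvis_heights if not lo <= h <= hi)
    return outside / len(pelvis_heights)


# Toy trajectory: the foot slides 1 cm in one 0.1 s contact step (0.1 m/s).
foot = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.5, 0.0, 0.0)]
contact = [True, True, True, False]

print(foot_slide_rate(foot, contact, dt=0.1))             # 0.5
print(pelvis_out_of_bounds_rate([0.9, 1.0, 1.5, 0.1]))    # 0.5
```

Comparing these violation rates between the baseline and the prior-augmented model, on the same sequences, would indicate whether the prior changes physical plausibility and not only final drift.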

This is the kind of incremental but usable paper that people working on AR and wearable inertial tracking will want to read. It is not reshaping the field, but the scale of the evaluation and the focus on existing hardware make it worth a referee's time. I would send it out for review rather than desk reject, with the expectation that the methods section needs to address the isolation of the prior's effect.

Referee Report

2 major / 2 minor

Summary. The paper presents MARIO, a learning-based inertial odometry framework that augments existing IO architectures with a learned IMU-inferred pose prior derived from human motion data to enforce physically consistent kinematic constraints. It reports up to 36% reduction in positional drift on the large-scale Nymeria dataset (5x larger than prior benchmarks) and up to 42% reduction when auxiliary sensors available on commercial AR glasses (magnetometers, barometers, secondary IMUs) are fused in, claiming a new paradigm for camera-less, robust human tracking.

Significance. If the pose prior demonstrably supplies independent kinematic constraints rather than additional supervised capacity, the work would advance lightweight, drift-resistant odometry for AR/wearables by scaling to daily activities on substantially larger datasets and integrating readily available multimodal signals. The reported gains on Nymeria would be notable if supported by ablations isolating the prior's contribution.

major comments (2)

[Abstract and §4 (method description)] The central claim that the IMU-inferred pose prior 'promotes physically consistent motion constraints' (abstract) lacks supporting evidence: no evaluation shows that output trajectories satisfy independent kinematic invariants (e.g., near-zero foot-contact velocity, pelvis height bounds, or joint-angle limits) at higher rates than the baseline IO method, nor any ablation that isolates the prior from network capacity or from the auxiliary-sensor fusion module.
[Abstract and experimental results section] The reported 36% and 42% positional-drift reductions are presented without error bars, statistical significance tests, or details on data exclusion criteria and train/test splits on Nymeria; this makes it impossible to determine whether the gains are robust or could be explained by dataset-specific correlations rather than the kinematic prior.
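One way to attach the error bars requested in the second major comment is a paired bootstrap over test sequences. The sketch below is an assumption about how that could be done, not the authors' procedure: it resamples sequences with replacement and reports a percentile interval for the relative drift reduction. The function name, the number of resamples, and the sample data are all hypothetical.

```python
import random
from statistics import fmean


def bootstrap_reduction_ci(baseline_err, method_err, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for 1 - mean(method) / mean(baseline)."""
    if len(baseline_err) != len(method_err) or not baseline_err:
        raise ValueError("need paired, non-empty per-sequence errors")
    rng = random.Random(seed)
    n = len(baseline_err)
    samples = []
    for _ in range(n_boot):
        # Resample sequence indices so both methods see the same sequences.
        idx = [rng.randrange(n) for _ in range(n)]
        base = fmean([baseline_err[i] for i in idx])
        meth = fmean([method_err[i] for i in idx])
        samples.append(1.0 - meth / base)
    samples.sort()
    lo = samples[int((alpha / 2) * n_boot)]
    hi = samples[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Made-up per-sequence drift values in metres, for illustration only.
baseline = [2.1, 3.8, 5.9, 4.4, 3.0, 6.2]
with_prior = [1.6, 2.9, 3.1, 3.5, 2.2, 4.0]

low, high = bootstrap_reduction_ci(baseline, with_prior)
print(f"95% interval for drift reduction: [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the gain is unlikely to be a sampling artifact of the chosen test sequences; it does not by itself show that the gain comes from the kinematic prior.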

minor comments (2)

[Abstract] The abstract states Nymeria is '5x larger than datasets used in prior work' but does not name the prior datasets or provide size comparisons in a table.
[Methods] Notation for the pose prior (e.g., how it is integrated as a loss term or constraint into the base IO architecture) should be formalized with an equation in the methods section for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline the revisions we will make.

Referee: [Abstract and §4 (method description)] The central claim that the IMU-inferred pose prior 'promotes physically consistent motion constraints' (abstract) lacks supporting evidence: no evaluation shows that output trajectories satisfy independent kinematic invariants (e.g., near-zero foot-contact velocity, pelvis height bounds, or joint-angle limits) at higher rates than the baseline IO method, nor any ablation that isolates the prior from network capacity or from the auxiliary-sensor fusion module.

Authors: We agree that the current manuscript does not include direct evaluations of kinematic invariants (such as foot-contact velocity or pelvis height bounds) or ablations that isolate the pose prior from network capacity and the auxiliary-sensor fusion module. In the revised version, we will add quantitative comparisons of these kinematic metrics against baselines and controlled ablations that vary network capacity while holding other components fixed to isolate the prior's contribution.

revision: yes

Referee: [Abstract and experimental results section] The reported 36% and 42% positional-drift reductions are presented without error bars, statistical significance tests, or details on data exclusion criteria and train/test splits on Nymeria; this makes it impossible to determine whether the gains are robust or could be explained by dataset-specific correlations rather than the kinematic prior.

Authors: We acknowledge that the reported improvements lack error bars, statistical significance tests, and explicit details on Nymeria data splits and exclusion criteria. In the revision, we will include error bars or confidence intervals on all reported metrics, conduct and report statistical significance tests, and provide full documentation of the train/test splits along with any exclusion criteria applied to the dataset.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical integration of learned prior with external validation

The paper presents a learned IMU-inferred pose prior integrated into existing IO architectures, with reported positional drift reductions (36%/42%) on the external Nymeria dataset. No equations, self-citations, or parameter-fitting steps are exhibited that reduce the central claims to inputs by construction. The derivation chain consists of architectural integration and multimodal fusion whose outputs are evaluated against independent benchmarks rather than being definitionally equivalent to the training data or prior results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review is based solely on the abstract; no free parameters, axioms, or invented entities can be identified from the provided text.

Reference graph

Works this paper leans on

42 extracted references · 2 canonical work pages · 1 internal anchor

[1] Xiya Cao, Caifa Zhou, Dandan Zeng, and Yongliang Wang. Rio: Rotation-equivariance supervised learning of robust inertial odometry. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6614–6623, 2022.

[2] Changhao Chen, Xiaoxuan Lu, Andrew Markham, and Niki Trigoni. Ionet: Learning to cure the curse of drift in inertial odometry. In Proceedings of the AAAI conference on artificial intelligence, 2018.

[3] Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, and Niki Trigoni. Vinet: Visual-inertial odometry as a sequence-to-sequence learning problem, 2017.

[4] Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, et al. Project aria: A new tool for egocentric multi-modal ai research. arXiv preprint arXiv:2308.13561, 2023.

[5] Guoquan Huang. Visual-inertial navigation: A concise review. In 2019 international conference on robotics and automation (ICRA), pages 9572–9582. IEEE, 2019.

[6] Yinghao Huang, Manuel Kaufmann, Emre Aksan, Michael J Black, Otmar Hilliges, and Gerard Pons-Moll. Deep inertial poser: Learning to reconstruct human pose from sparse inertial measurements in real time. ACM Transactions on Graphics (TOG), 37(6):1–15, 2018.

[7] Royina Karegoudra Jayanth, Yinshuang Xu, Ziyun Wang, Evangelos Chatzipantazis, Daniel Gehrig, and Kostas Daniilidis. Eqnio: Subequivariant neural inertial odometry. arXiv preprint arXiv:2408.06321, 2024.

[8] Yifeng Jiang, Yuting Ye, Deepak Gopinath, Jungdam Won, Alexander W Winkler, and C Karen Liu. Transformer inertial poser: Real-time human motion reconstruction from sparse imus with simultaneous terrain generation. In SIGGRAPH Asia 2022 Conference Papers, pages 1–9, 2022.

[9] A. R. Jimenez, F. Seco, C. Prieto, and J. Guevara. A comparison of pedestrian dead-reckoning algorithms using a low-cost mems imu. In 2009 6th IEEE International Symposium on Intelligent Signal Processing, pages 37–42. IEEE, 2009.

[10] A.R. Jiménez, F. Seco, J.C. Prieto, and J. Guevara. Indoor pedestrian navigation using an ins/ekf framework for yaw drift reduction and a foot-mounted imu. In 2010 7th Workshop on Positioning, Navigation and Communication, pages 135–143, 2010.

[11] Daniel Laidig and Thomas Seel. Vqf: Highly accurate imu orientation estimation with bias estimation and magnetic disturbance rejection. Information Fusion, 91:187–204, 2023.

[12] Dongjae Lee, Minwoo Jung, Wooseong Yang, and Ayoung Kim. Lidar odometry survey: recent advancements and remaining challenges. Intelligent Service Robotics, 17(2):95–118, 2024.

[13] Woosik Lee, Patrick Geneva, Chuchu Chen, and Guoquan Huang. Mins: Efficient and robust multisensor-aided inertial navigation system, 2023.

[14] Yadong Li, Shuning Wang, Yongjian Fu, Justin Chen, Xingyu Chen, Ju Ren, Xinyu Zhang, Akshay Gadre, and Ke Sun. Ultraposer: Pushing the limits of imu-based full-body pose estimation with ultrasound sensing on consumer wearables. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, pages 1–15, 2025.

[15] Yan Li, Yang Xu, Changhao Chen, Zhongchen Shi, Wei Chen, Liang Xie, Hongbo Chen, and Erwei Yin. M2eit: Multi-domain mixture of experts for robust neural inertial tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 28207–28216, 2025.

[16] Wenxin Liu, David Caruso, Eddy Ilg, Jing Dong, Anastasios I. Mourikis, Kostas Daniilidis, Vijay Kumar, and Jakob Engel. Tlio: Tight learned inertial odometry. IEEE Robotics and Automation Letters, 5(4):5653–5660, 2020.

[17] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J Black. Smpl: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 851–866. 2023.

[18] Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, and ... Aria everyday activities dataset, 2024.

[19] Lingni Ma, Yuting Ye, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Rowan Postyeni, Luis Pesqueira, Alexander Gamino, Vijay Baiyya, Hyo Jin Kim, Kevin Bailey, David Soriano Fosas, C. Karen Liu, Ziwei Liu, Jakob Engel, Renzo De Nardi, and Richard Newcombe. Nymeria: A massive collection of multimodal egocentric daily motion in the wild, 2024.

[20] Sebastian O. H. Madgwick, Andrew J. L. Harrison, and Ravi Vaidyanathan. Estimation of imu and marg orientation using a gradient descent algorithm. In 2011 IEEE International Conference on Rehabilitation Robotics (ICORR), pages 1–7, Zurich, Switzerland, 2011. IEEE.

[21] Robert Mahony, Tarek Hamel, and Jean-Michel Pflimlin. Nonlinear complementary filters on the special orthogonal group. IEEE Transactions on Automatic Control, 53(5):1203–1218, 2008.

[22] Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, and Karan Ahuja. Imuposer: Full-body pose estimation using imus in phones, watches, and earbuds. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1–12, 2023.

[23] Tong Qin, Peiliang Li, and Shaojie Shen. Vins-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004–1020, 2018.

[24] Yuheng Qiu, Can Xu, Yutian Chen, Shibo Zhao, Junyi Geng, and Sebastian Scherer. Airio: Learning inertial odometry with enhanced imu feature observability, 2025.

[25] Yunzhe Shao, Xinyu Yi, Lu Yin, Shihui Guo, Junhai Yong, and Feng Xu. Magshield: Towards better robustness in sparse inertial motion capture under magnetic disturbances,

[26] Scott Sun, Dennis Melamed, and Kris Kitani. Idol: Inertial deep orientation-estimation and localization. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 6128–6137, 2021.

[27] Tom Van Wouwe, Seunghwan Lee, Antoine Falisse, Scott Delp, and C Karen Liu. Diffusionposer: Real-time human motion reconstruction from arbitrary sparse sensors using autoregressive diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2513–2523, 2024.

[28] Timo Von Marcard, Bodo Rosenhahn, Michael J Black, and Gerard Pons-Moll. Sparse inertial poser: Automatic 3d human pose estimation from sparse imus. In Computer graphics forum, pages 349–360. Wiley Online Library, 2017.

[29] Jian Wang, Rishabh Dabral, Diogo Luvizon, Zhe Cao, Lingjie Liu, Thabo Beeler, and Christian Theobalt. Ego4o: Egocentric human motion capture and understanding from multi-modal input, 2025.

[30] Xsens Technologies B.V. Xsens IMU Systems. https://www.xsens.com. Accessed: 2024-03-07.

[31] Vasco Xu, Chenfeng Gao, Henry Hoffmann, and Karan Ahuja. Mobileposer: Real-time full-body pose estimation and 3d human translation from imus in mobile consumer devices. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, New York, NY, USA, 2024. Association for Computing Machinery.

[32] Hang Yan, Qi Shan, and Yasutaka Furukawa. Ridi: Robust imu double integration. In Proceedings of the European conference on computer vision (ECCV), pages 621–636, 2018.

[33] Hang Yan, Sachini Herath, and Yasutaka Furukawa. Ronin: Robust neural inertial navigation in the wild: Benchmark, evaluations, and new methods, 2019.

[34] Yuan Yao, Shifan Jiang, Yangqing Hou, Chengxu Zuo, Xinrui Chen, Shihui Guo, and Yipeng Qin. Tof-ip: time-of-flight enhanced sparse inertial poser for real-time human motion capture. 2025.

[35] Xinyu Yi, Yuxiao Zhou, and Feng Xu. Transpose: Real-time 3d human translation and pose estimation with six inertial sensors. ACM Transactions on Graphics (TOG), 40(4):1–13,

[36] Xinyu Yi, Yuxiao Zhou, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt, and Feng Xu. Physical inertial poser (pip): Physics-aware real-time human motion tracking from sparse inertial sensors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13167–13178, 2022.

[37] Xinyu Yi, Yuxiao Zhou, and Feng Xu. Physical non-inertial poser (pnp): modeling non-inertial effects in sparse-inertial human motion capture. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.

[38] Xinyu Yi, Shaohua Pan, and Feng Xu. Improving global motion estimation in sparse imu-based motion capture with physics. ACM Transactions on Graphics (TOG), 44(4):1–16,

[39] Libo Zhang, Xinyu Yi, and Feng Xu. Baroposer: Real-time human motion tracking from imus and barometers in everyday devices. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, page 1–9. ACM, 2025.

[40] Shibo Zhao, Sifan Zhou, Raphael Blanchard, Yuheng Qiu, Wenshan Wang, and Sebastian Scherer. Tartan imu: A light foundation model for inertial positioning in robotics. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 22520–22529, 2025.

[41] Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. On the continuity of rotation representations in neural networks, 2020.

[42] Chengxu Zuo, Jiawei Huang, Xiao Jiang, Yuan Yao, Xiangren Shi, Rui Cao, Xinyu Yi, Feng Xu, Shihui Guo, and Yipeng Qin. Transformer imu calibrator: Dynamic on-body imu calibration for inertial motion capture. ACM Transactions on Graphics (TOG), 44(4):1–14, 2025.

[1] [1]

Rio: Rotation-equivariance supervised learning of robust in- ertial odometry

Xiya Cao, Caifa Zhou, Dandan Zeng, and Yongliang Wang. Rio: Rotation-equivariance supervised learning of robust in- ertial odometry. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, pages 6614–6623, 2022. 2

2022

[2] [2]

Ionet: Learning to cure the curse of drift in inertial odometry

Changhao Chen, Xiaoxuan Lu, Andrew Markham, and Niki Trigoni. Ionet: Learning to cure the curse of drift in inertial odometry. InProceedings of the AAAI conference on artifi- cial intelligence, 2018. 2

2018

[3] [3]

Vinet: Visual-inertial odometry as a sequence-to-sequence learning problem, 2017

Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, and Niki Trigoni. Vinet: Visual-inertial odometry as a sequence-to-sequence learning problem, 2017. 1

2017

[4] [4]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, et al. Project aria: A new tool for egocentric multi-modal ai research.arXiv preprint arXiv:2308.13561, 2023. 2, 4

work page internal anchor Pith review Pith/arXiv arXiv 2023

[5] [5]

Visual-inertial navigation: A concise re- view

Guoquan Huang. Visual-inertial navigation: A concise re- view. In2019 international conference on robotics and au- tomation (ICRA), pages 9572–9582. IEEE, 2019. 3

2019

[6] [6]

Deep iner- tial poser: Learning to reconstruct human pose from sparse inertial measurements in real time.ACM Transactions on Graphics (TOG), 37(6):1–15, 2018

Yinghao Huang, Manuel Kaufmann, Emre Aksan, Michael J Black, Otmar Hilliges, and Gerard Pons-Moll. Deep iner- tial poser: Learning to reconstruct human pose from sparse inertial measurements in real time.ACM Transactions on Graphics (TOG), 37(6):1–15, 2018. 3

2018

[7] [7]

Eqnio: Subequivariant neural inertial odometry.arXiv preprint arXiv:2408.06321, 2024

Royina Karegoudra Jayanth, Yinshuang Xu, Ziyun Wang, Evangelos Chatzipantazis, Daniel Gehrig, and Kostas Dani- ilidis. Eqnio: Subequivariant neural inertial odometry.arXiv preprint arXiv:2408.06321, 2024. 1, 2, 3, 6

work page arXiv 2024

[8] [8]

Transformer inertial poser: Real-time human motion reconstruction from sparse imus with simultaneous terrain generation

Yifeng Jiang, Yuting Ye, Deepak Gopinath, Jungdam Won, Alexander W Winkler, and C Karen Liu. Transformer inertial poser: Real-time human motion reconstruction from sparse imus with simultaneous terrain generation. InSIGGRAPH Asia 2022 Conference Papers, pages 1–9, 2022. 3

2022

[9] [9]

A. R. Jimenez, F. Seco, C. Prieto, and J. Guevara. A compar- ison of pedestrian dead-reckoning algorithms using a low- cost mems imu. In2009 6th IEEE International Symposium on Intelligent Signal Processing, pages 37–42. IEEE, 2009. 2

2009

[10] [10]

Jim ´enez, F

A.R. Jim ´enez, F. Seco, J.C. Prieto, and J. Guevara. Indoor pedestrian navigation using an ins/ekf framework for yaw drift reduction and a foot-mounted imu. In2010 7th Work- shop on Positioning, Navigation and Communication, pages 135–143, 2010. 2

2010

[11] [11]

Vqf: Highly accurate imu orientation estimation with bias estimation and magnetic dis- turbance rejection.Information Fusion, 91:187–204, 2023

Daniel Laidig and Thomas Seel. Vqf: Highly accurate imu orientation estimation with bias estimation and magnetic dis- turbance rejection.Information Fusion, 91:187–204, 2023. 3

2023

[12] [12]

Lidar odometry survey: recent advancements and re- maining challenges.Intelligent Service Robotics, 17(2):95– 118, 2024

Dongjae Lee, Minwoo Jung, Wooseong Yang, and Ayoung Kim. Lidar odometry survey: recent advancements and re- maining challenges.Intelligent Service Robotics, 17(2):95– 118, 2024. 3

2024

[13] [13]

Mins: Efficient and robust multisensor-aided inertial navigation system, 2023

Woosik Lee, Patrick Geneva, Chuchu Chen, and Guoquan Huang. Mins: Efficient and robust multisensor-aided inertial navigation system, 2023. 3

2023

[14] Yadong Li, Shuning Wang, Yongjian Fu, Justin Chen, Xingyu Chen, Ju Ren, Xinyu Zhang, Akshay Gadre, and Ke Sun. Ultraposer: Pushing the limits of imu-based full-body pose estimation with ultrasound sensing on consumer wearables. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, pages 1–15, 2025.

[15] Yan Li, Yang Xu, Changhao Chen, Zhongchen Shi, Wei Chen, Liang Xie, Hongbo Chen, and Erwei Yin. M2eit: Multi-domain mixture of experts for robust neural inertial tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 28207–28216, 2025.

[16] Wenxin Liu, David Caruso, Eddy Ilg, Jing Dong, Anastasios I. Mourikis, Kostas Daniilidis, Vijay Kumar, and Jakob Engel. Tlio: Tight learned inertial odometry. IEEE Robotics and Automation Letters, 5(4):5653–5660, 2020.

[17] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J Black. Smpl: A skinned multi-person linear model. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 851–866. 2023.

[18] Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, and ... Aria everyday activities dataset, 2024.

[19] Lingni Ma, Yuting Ye, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Rowan Postyeni, Luis Pesqueira, Alexander Gamino, Vijay Baiyya, Hyo Jin Kim, Kevin Bailey, David Soriano Fosas, C. Karen Liu, Ziwei Liu, Jakob Engel, Renzo De Nardi, and Richard Newcombe. Nymeria: A massive collection of multimodal egocentric daily motion in the wild, 2024.

[20] Sebastian O. H. Madgwick, Andrew J. L. Harrison, and Ravi Vaidyanathan. Estimation of imu and marg orientation using a gradient descent algorithm. In 2011 IEEE International Conference on Rehabilitation Robotics (ICORR), pages 1–7, Zurich, Switzerland, 2011. IEEE.

[21] Robert Mahony, Tarek Hamel, and Jean-Michel Pflimlin. Nonlinear complementary filters on the special orthogonal group. IEEE Transactions on Automatic Control, 53(5):1203–1218, 2008.

[22] Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, and Karan Ahuja. Imuposer: Full-body pose estimation using imus in phones, watches, and earbuds. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1–12, 2023.

[23] Tong Qin, Peiliang Li, and Shaojie Shen. Vins-mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004–1020, 2018.

[24] Yuheng Qiu, Can Xu, Yutian Chen, Shibo Zhao, Junyi Geng, and Sebastian Scherer. Airio: Learning inertial odometry with enhanced imu feature observability, 2025.

[25] Yunzhe Shao, Xinyu Yi, Lu Yin, Shihui Guo, Junhai Yong, and Feng Xu. Magshield: Towards better robustness in sparse inertial motion capture under magnetic disturbances,

[26] Scott Sun, Dennis Melamed, and Kris Kitani. Idol: Inertial deep orientation-estimation and localization. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 6128–6137, 2021.

[27] Tom Van Wouwe, Seunghwan Lee, Antoine Falisse, Scott Delp, and C Karen Liu. Diffusionposer: Real-time human motion reconstruction from arbitrary sparse sensors using autoregressive diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2513–2523, 2024.

[28] Timo von Marcard, Bodo Rosenhahn, Michael J Black, and Gerard Pons-Moll. Sparse inertial poser: Automatic 3d human pose estimation from sparse imus. In Computer graphics forum, pages 349–360. Wiley Online Library, 2017.

[29] Jian Wang, Rishabh Dabral, Diogo Luvizon, Zhe Cao, Lingjie Liu, Thabo Beeler, and Christian Theobalt. Ego4o: Egocentric human motion capture and understanding from multi-modal input, 2025.

[30] Xsens Technologies B.V. Xsens IMU Systems. https://www.xsens.com. Accessed: 2024-03-07.

[31] Vasco Xu, Chenfeng Gao, Henry Hoffmann, and Karan Ahuja. Mobileposer: Real-time full-body pose estimation and 3d human translation from imus in mobile consumer devices. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, New York, NY, USA, 2024. Association for Computing Machinery.

[32] Hang Yan, Qi Shan, and Yasutaka Furukawa. Ridi: Robust imu double integration. In Proceedings of the European conference on computer vision (ECCV), pages 621–636, 2018.

[33] Hang Yan, Sachini Herath, and Yasutaka Furukawa. Ronin: Robust neural inertial navigation in the wild: Benchmark, evaluations, and new methods, 2019.

[34] Yuan Yao, Shifan Jiang, Yangqing Hou, Chengxu Zuo, Xinrui Chen, Shihui Guo, and Yipeng Qin. Tof-ip: time-of-flight enhanced sparse inertial poser for real-time human motion capture. 2025.

[35] Xinyu Yi, Yuxiao Zhou, and Feng Xu. Transpose: Real-time 3d human translation and pose estimation with six inertial sensors. ACM Transactions on Graphics (TOG), 40(4):1–13,

[36] Xinyu Yi, Yuxiao Zhou, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt, and Feng Xu. Physical inertial poser (pip): Physics-aware real-time human motion tracking from sparse inertial sensors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13167–13178, 2022.

[37] Xinyu Yi, Yuxiao Zhou, and Feng Xu. Physical non-inertial poser (pnp): modeling non-inertial effects in sparse-inertial human motion capture. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.

[38] Xinyu Yi, Shaohua Pan, and Feng Xu. Improving global motion estimation in sparse imu-based motion capture with physics. ACM Transactions on Graphics (TOG), 44(4):1–16,

[39] Libo Zhang, Xinyu Yi, and Feng Xu. Baroposer: Real-time human motion tracking from imus and barometers in everyday devices. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology, pages 1–9. ACM, 2025.

[40] Shibo Zhao, Sifan Zhou, Raphael Blanchard, Yuheng Qiu, Wenshan Wang, and Sebastian Scherer. Tartan imu: A light foundation model for inertial positioning in robotics. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 22520–22529, 2025.

[41] Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. On the continuity of rotation representations in neural networks, 2020.

[42] Chengxu Zuo, Jiawei Huang, Xiao Jiang, Yuan Yao, Xiangren Shi, Rui Cao, Xinyu Yi, Feng Xu, Shihui Guo, and Yipeng Qin. Transformer imu calibrator: Dynamic on-body imu calibration for inertial motion capture. ACM Transactions on Graphics (TOG), 44(4):1–14, 2025.