pith. sign in

arxiv: 2604.07071 · v1 · submitted 2026-04-08 · 💻 cs.HC · cs.CR

BioMoTouch: Touch-Based Behavioral Authentication via Biometric-Motion Interaction Modeling

Pith reviewed 2026-05-10 17:17 UTC · model grok-4.3

classification 💻 cs.HC cs.CR
keywords touch-based authenticationmulti-modal fusioncapacitive sensinginertial sensingmobile biometricsbehavioral authenticationphysiological signalsattack resistance
0
0 comments X

The pith

Integrating capacitive signals of finger contact with inertial measurements of motion creates a unified touch authentication model that reaches 99.71 percent balanced accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Touch interactions carry two distinct signals at once: the capacitive screen records the shape and structure of the finger in contact, while the phone's motion sensors record how the user moves the device during the same action. BioMoTouch learns the specific coordinated pattern between these signals rather than treating them as separate inputs. This joint representation forms a user signature that is harder to replicate because an attacker must match both the physiological contact and the behavioral dynamics simultaneously. The system runs during ordinary tapping and swiping on standard phones with no added hardware or user steps. Evaluation on 38 participants under realistic conditions reports 99.71 percent balanced accuracy, 0.27 percent equal error rate, and false acceptance below 0.90 percent even when attackers attempt artificial replication, mimicry, or puppet-style attacks.

Core claim

During touch interaction, inertial sensors capture user-specific behavioral dynamics, while capacitive screens simultaneously capture physiological characteristics related to finger morphology and skeletal structure. BioMoTouch jointly models physiological contact structures and behavioral motion dynamics by integrating capacitive touchscreen signals with inertial measurements. Rather than combining independent decisions, the framework explicitly learns their coordinated interaction to form a unified representation of touch behavior.

What carries the argument

The explicit learning of coordinated interaction between capacitive physiological contact structures and inertial behavioral motion dynamics to produce a single unified user representation.

Load-bearing premise

The coordinated interaction between capacitive physiological signals and inertial behavioral dynamics produces stable user-specific representations that generalize beyond the 38 participants and resist attacks not included in the evaluation.

What would settle it

A larger study with hundreds of users across varied devices and environments that records substantially higher error rates or successful advanced attacks would show the representations are not as stable or robust as claimed.

Figures

Figures reproduced from arXiv: 2604.07071 by Hongda Zhai, Hongwei Li, Jianbang Chen, Jun Feng, Man Zhou, Qian Wang, Qi Li, Zhengxiong Li, Zijian Ling.

Figure 1
Figure 1. Figure 1: The workflow of BioMoTouch. IV. METHOD A. System Overview Our objective is to design an implicit, hardware-free touch authentication system that operates transparently during natu￾ral user interactions while remaining robust against advanced attack scenarios, including mimicry, artificial replication, and puppet attacks. Achieving this goal requires addressing two key challenges: (i) modeling the multi-dim… view at source ↗
Figure 3
Figure 3. Figure 3: Visualized feature space of raw and augmented (aug.) [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Illustration of the data collection process. [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: Example fingerprint images acquired by the fingerprint [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: ROC curves of the IMU-based method. 0.0 0.1 0.2 0.3 0.4 False Positive Rate 0.6 0.7 0.8 0.9 1.0 True Positive Rate OCSVM (EER=1.00%) LOF (EER=2.93%) IF (EER=4.18%) [PITH_FULL_IMAGE:figures/full_fig_p009_7.png] view at source ↗
Figure 10
Figure 10. Figure 10: Decision score distributions of the IMU-based method. 0.0 0.2 0.4 0.6 0.8 1.0 decision score 0 5 10 15 20 density Legitimate Illegitimate [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗
Figure 14
Figure 14. Figure 14: EERs of different one-class classifiers across fingers. [PITH_FULL_IMAGE:figures/full_fig_p011_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: EERs under different user pos￾tures. Dry Finger Wet Finger 0 2 4 6 8 EER (%) Our Method Capacitive-Based [PITH_FULL_IMAGE:figures/full_fig_p012_15.png] view at source ↗
read the original abstract

Touch-based authentication is widely deployed on mobile devices due to its convenience and seamless user experience. However, existing systems largely model touch interaction as a purely behavioral signal, overlooking its intrinsic multidimensional nature and limiting robustness against sophisticated adversarial behaviors and real-world variations. In this work, we present BioMoTouch, a multi-modal touch authentication framework on mobile devices grounded in a key empirical finding: during touch interaction, inertial sensors capture user-specific behavioral dynamics, while capacitive screens simultaneously capture physiological characteristics related to finger morphology and skeletal structure. Building upon this insight, BioMoTouch jointly models physiological contact structures and behavioral motion dynamics by integrating capacitive touchscreen signals with inertial measurements. Rather than combining independent decisions, the framework explicitly learns their coordinated interaction to form a unified representation of touch behavior. BioMoTouch operates implicitly during natural user interactions and requires no additional hardware, enabling practical deployment on commodity mobile devices. We evaluate BioMoTouch with 38 participants under realistic usage conditions. Experimental results show that BioMoTouch achieves a balanced accuracy of 99.71% and an equal error rate of 0.27%. Moreover, it maintains false acceptance rates below 0.90% under artificial replication, mimicry, and puppet attack scenarios, demonstrating strong robustness against partial-factor manipulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces BioMoTouch, a multi-modal touch authentication system for mobile devices that jointly models physiological contact structures from capacitive sensors and behavioral motion dynamics from inertial sensors. Rather than fusing independent decisions, it learns a unified representation of their coordinated interaction during natural touch interactions. Evaluated on 38 participants under realistic conditions, the system reports 99.71% balanced accuracy, 0.27% equal error rate, and false acceptance rates below 0.90% against artificial replication, mimicry, and puppet attacks, claiming strong robustness without additional hardware.

Significance. If the empirical results hold after addressing validation details, the work would be significant for mobile HCI and security by demonstrating that the intrinsic physiological-behavioral coupling in touch can yield stable, user-specific representations resistant to partial-factor attacks. The approach of learning joint interaction rather than post-hoc fusion, combined with evaluation on held-out user data, provides a concrete step beyond purely behavioral models. The small cohort and simulated attacks limit immediate claims to broad generalization, but the core insight has clear potential impact if replicated.

major comments (3)
  1. [Evaluation section] Evaluation section: The manuscript reports strong performance metrics (99.71% balanced accuracy, 0.27% EER) but provides no details on data partitioning (e.g., per-user train/test splits, session-based cross-validation), model training procedures (hyperparameters, optimization, regularization), or statistical tests (confidence intervals, significance testing). These omissions are load-bearing for assessing whether the results reflect genuine generalization or artifacts from small per-user samples.
  2. [Attack scenarios subsection] Attack scenarios subsection: The puppet attack description does not specify whether the simulations involve coordinated, simultaneous spoofing of both capacitive morphology and inertial trajectories to target the joint learned representation. Without explicit protocols for multi-modal forgery, the reported FAR < 0.90% does not yet establish robustness against adaptive adversaries optimizing against the unified model.
  3. [Participant cohort and generalization] Participant cohort and generalization: With N=38, the study has limited power to detect inter-user variability or demographic effects. High within-cohort accuracy could arise from overfitting or session-specific artifacts, particularly if models are trained on limited samples per user; this directly undermines the claim of stable, user-specific representations that generalize beyond the tested group.
minor comments (2)
  1. [Abstract and threat model] The term 'partial-factor manipulation' in the abstract and results is used without a precise definition tying it to the specific attack types; clarifying this in the threat model section would improve readability.
  2. [Method diagrams] Figure captions and method diagrams could more explicitly label the fusion point where coordinated interaction is learned, to distinguish it from simple concatenation or late fusion baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review. We address each major comment below with point-by-point responses. Revisions have been made to add missing methodological details and clarify attack protocols; the cohort size limitation is acknowledged and discussed as a constraint of the current study design.

read point-by-point responses
  1. Referee: [Evaluation section] The manuscript reports strong performance metrics (99.71% balanced accuracy, 0.27% EER) but provides no details on data partitioning (e.g., per-user train/test splits, session-based cross-validation), model training procedures (hyperparameters, optimization, regularization), or statistical tests (confidence intervals, significance testing). These omissions are load-bearing for assessing whether the results reflect genuine generalization or artifacts from small per-user samples.

    Authors: We agree these details are essential for reproducibility and validating generalization. The revised manuscript expands the Evaluation section with explicit descriptions of per-user train/test splits and session-based cross-validation. We now include the full set of hyperparameters, optimization algorithm, regularization techniques, and report confidence intervals plus statistical significance tests for all metrics to confirm results are not artifacts of small samples. revision: yes

  2. Referee: [Attack scenarios subsection] The puppet attack description does not specify whether the simulations involve coordinated, simultaneous spoofing of both capacitive morphology and inertial trajectories to target the joint learned representation. Without explicit protocols for multi-modal forgery, the reported FAR < 0.90% does not yet establish robustness against adaptive adversaries optimizing against the unified model.

    Authors: We have revised the Attack scenarios subsection to explicitly state that puppet attacks simulated coordinated, simultaneous spoofing of both capacitive contact morphology and inertial trajectories targeting the joint representation. Detailed protocols for the multi-modal forgery process are now provided, supporting that the reported FAR reflects robustness against such adaptive attacks. revision: yes

  3. Referee: [Participant cohort and generalization] With N=38, the study has limited power to detect inter-user variability or demographic effects. High within-cohort accuracy could arise from overfitting or session-specific artifacts, particularly if models are trained on limited samples per user; this directly undermines the claim of stable, user-specific representations that generalize beyond the tested group.

    Authors: We acknowledge the genuine limitation of N=38 for detecting demographic effects or ensuring broad generalization. The revised manuscript adds an explicit limitations discussion noting the risk of overfitting or session artifacts and stresses that evaluation used held-out user data. We agree larger, more diverse cohorts are needed for stronger generalization claims and have framed the current results accordingly. revision: partial

Circularity Check

0 steps flagged

No significant circularity; empirical ML framework is self-contained

full rationale

The BioMoTouch paper presents a data-driven multi-modal authentication system that learns a joint representation of capacitive physiological signals and inertial behavioral dynamics from raw sensor inputs. Performance metrics (99.71% balanced accuracy, 0.27% EER, FAR <0.90% on attacks) are obtained via evaluation on held-out data from 38 participants under realistic conditions, using standard training and testing protocols rather than any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations, ansatzes, or uniqueness theorems reduce the claimed results to the inputs by construction. The derivation chain relies on empirical observation and machine learning, remaining independent of the target authentication outcomes.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

Abstract-only review limits visibility into exact model details; the central claim rests on two domain assumptions about sensor capabilities and standard learned parameters in the fusion model.

free parameters (1)
  • model fusion and interaction learning parameters
    The framework learns coordinated interaction between modalities, implying data-fitted parameters whose exact count and values are not specified.
axioms (2)
  • domain assumption Inertial sensors capture user-specific behavioral dynamics during touch interaction
    Presented as a key empirical finding grounding the behavioral component.
  • domain assumption Capacitive screens capture physiological characteristics related to finger morphology and skeletal structure
    Presented as the complementary physiological signal source.

pith-pipeline@v0.9.0 · 5545 in / 1443 out tokens · 62803 ms · 2026-05-10T17:17:06.469396+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1]

    Challenges and opportunities of biometric user authentication in the age of iot: A survey,

    C.-W. Lien and S. Vhaduri, “Challenges and opportunities of biometric user authentication in the age of iot: A survey,”ACM Computing Surveys, vol. 56, no. 1, pp. 1–37, 2023

  2. [2]

    Touch ID Security Overview,

    Apple Inc., “Touch ID Security Overview,” https://support.apple.com/ en-us/HT204587, 2023

  3. [3]

    Face id security,

    ——, “Face id security,” https://support.apple.com/en-us/102381, 2024

  4. [4]

    Ultrasonic Unlock: The Innovation Behind Sam- sung’s In-Display Fingerprint ID,

    Samsung Electronics, “Ultrasonic Unlock: The Innovation Behind Sam- sung’s In-Display Fingerprint ID,” https://insights.samsung.com, 2019

  5. [5]

    Finger recovery transformer: Toward better incomplete fingerprint identification,

    Z. Jia, C. Huang, Z. Wang, H. Fei, S. Wu, and J. Feng, “Finger recovery transformer: Toward better incomplete fingerprint identification,”IEEE Transactions on Information Forensics and Security, vol. 19, pp. 8860– 8874, 2024

  6. [6]

    Age-invariant face recognition by multi-feature fusion and decomposition with self-attention,

    C. Yan, L. Meng, L. Li, J. Zhang, Z. Wang, J. Yin, J. Zhang, Y . Sun, and B. Zheng, “Age-invariant face recognition by multi-feature fusion and decomposition with self-attention,”ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 18, no. 1s, pp. 1– 18, 2022

  7. [7]

    Data leak exposes unchangeable biometric data of over 1 million people,

    MIT Technology Review, “Data leak exposes unchangeable biometric data of over 1 million people,” https://www.technologyreview.com/2019/ 08/14/133723/, Aug. 2019

  8. [8]

    Stealing your fingerprint via the finger friction sound,

    M. Zhou, L. Wang, Y . Sun, S. Su, X. Ma, Q. Li, and Q. Wang, “Stealing your fingerprint via the finger friction sound,”IEEE Transactions on Networking, vol. 34, pp. 276–291, 2026

  9. [9]

    Realistic fingerprint presentation attacks based on an adversarial approach,

    R. Casula, G. Orr `u, S. Marrone, U. Gagliardini, G. L. Marcialis, and C. Sansone, “Realistic fingerprint presentation attacks based on an adversarial approach,”IEEE Transactions on Information Forensics and Security, vol. 19, pp. 863–877, 2024

  10. [10]

    Sibling-attack: Rethinking transferable adversarial attacks against face recognition,

    Z. Li, B. Yin, T. Yao, J. Guo, S. Ding, S. Chen, and C. Liu, “Sibling-attack: Rethinking transferable adversarial attacks against face recognition,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 24 626–24 637

  11. [11]

    FVFSNet: Frequency-spatial coupling network for finger vein authentication,

    J. Huang, A. Zheng, M. S. Shakeel, W. Yang, and W. Kang, “FVFSNet: Frequency-spatial coupling network for finger vein authentication,”IEEE Transactions on Information Forensics and Security, vol. 18, pp. 1322– 1334, 2023

  12. [12]

    Palm vein template protection scheme for resisting similarity attacks,

    Y . Li, W. Wu, Y . Zhang, and C. Li, “Palm vein template protection scheme for resisting similarity attacks,”Computers & Security, vol. 150, p. 104227, 2025. 13

  13. [13]

    Perfusion assessment via local remote photoplethysmog- raphy (rppg),

    B. Kossack, E. Wisotzky, P. Eisert, S. P. Schraven, B. Globke, and A. Hilsmann, “Perfusion assessment via local remote photoplethysmog- raphy (rppg),” inthe IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 2192–2201

  14. [14]

    Caphandauth: Robust and anti-spoofing hand authentication via cots capacitive touchscreens,

    X. Zhu, M. Zhou, X. Qiao, Z. Ling, Q. Liu, H. Wang, X. Ma, and Z. Li, “Caphandauth: Robust and anti-spoofing hand authentication via cots capacitive touchscreens,” inthe 23rd ACM Conference on Embedded Networked Sensor Systems (SenSys), 2025, pp. 560–573

  15. [15]

    PCR-Auth: Solving authentication puzzle challenge with encoded palm contact response,

    L. Huang and C. Wang, “PCR-Auth: Solving authentication puzzle challenge with encoded palm contact response,” inthe IEEE Symposium on Security and Privacy (S&P), 2022, pp. 1034–1048

  16. [16]

    Liveness is not enough: Enhancing fingerprint authentication with behavioral biometrics to de- feat puppet attacks,

    C. Wu, K. He, J. Chen, Z. Zhao, and R. Du, “Liveness is not enough: Enhancing fingerprint authentication with behavioral biometrics to de- feat puppet attacks,” in29th USENIX Security Symposium (USENIX Security), 2020, pp. 2219–2236

  17. [17]

    Touch technology in affective human–, robot–, and virtual–human interactions: A survey,

    T. Olugbade, L. He, P. Maiolino, D. Heylen, and N. Bianchi-Berthouze, “Touch technology in affective human–, robot–, and virtual–human interactions: A survey,”Proceedings of the IEEE, vol. 111, no. 10, pp. 1333–1354, 2023

  18. [18]

    It’s all in the touch: Authenticating users with HOST gestures on multi- touch screen devices,

    C. Wu, H. Cao, G. Xu, C. Zhou, J. Sun, R. Yan, Y . Liu, and H. Jiang, “It’s all in the touch: Authenticating users with HOST gestures on multi- touch screen devices,”IEEE Transactions on Mobile Computing, vol. 23, no. 10, pp. 10 016–10 030, 2024

  19. [19]

    Two-factor authentication for keyless entry system via finger-induced vibrations,

    H. Jiang, P. Ji, T. Zhang, H. Cao, and D. Liu, “Two-factor authentication for keyless entry system via finger-induced vibrations,”IEEE Transac- tions on Mobile Computing, vol. 23, no. 10, pp. 9708–9720, 2024

  20. [20]

    Presspin: Enabling secure pin authentication on mobile devices via structure-borne sounds,

    M. Zhou, Q. Wang, X. Lin, Y . Zhao, P. Jiang, Q. Li, C. Shen, and C. Wang, “Presspin: Enabling secure pin authentication on mobile devices via structure-borne sounds,”IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 2, pp. 1228–1242, 2023

  21. [21]

    TouchPass: Toward behavior-irrelevant on-touch user authentication on smartphones leveraging vibrations,

    X. Xu, J. Yu, Y . Chen, Q. Hua, Y . Zhu, Y .-C. Chen, and M. Li, “TouchPass: Toward behavior-irrelevant on-touch user authentication on smartphones leveraging vibrations,” inthe 26th ACM Annual Interna- tional Conference on Mobile Computing and Networking (MobiCom), 2020, pp. 1–13

  22. [22]

    Fingerslid: Towards finger-sliding contin- uous authentication on smart devices via vibration,

    Y . Xie, F. Li, and Y . Wang, “Fingerslid: Towards finger-sliding contin- uous authentication on smart devices via vibration,”IEEE Transactions on Mobile Computing, vol. 23, no. 5, pp. 6045–6059, 2024

  23. [23]

    Touch well before use: Intuitive and secure authentication for iot devices,

    X. Li, F. Yan, F. Zuo, Q. Zeng, and L. Luo, “Touch well before use: Intuitive and secure authentication for iot devices,” inthe 25th Annual International Conference on Mobile Computing and Networking (MobiCom), 2019, pp. 1–17

  24. [24]

    Performance analysis of multi-motion sensor behavior for active smartphone authen- tication,

    C. Shen, Y . Li, Y . Chen, X. Guan, and R. A. Maxion, “Performance analysis of multi-motion sensor behavior for active smartphone authen- tication,”IEEE Transactions on Information Forensics and Security, vol. 13, no. 1, pp. 48–62, 2018

  25. [25]

    Mmauth: A continuous authen- tication framework on smartphones using multiple modalities,

    Z. Shen, S. Li, X. Zhao, and J. Zou, “Mmauth: A continuous authen- tication framework on smartphones using multiple modalities,”IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1450– 1465, 2022

  26. [26]

    Multi-touch authentication using hand geometry and behavioral information,

    Y . Song, Z. Cai, and Z.-L. Zhang, “Multi-touch authentication using hand geometry and behavioral information,” inIEEE Symposium on Security and Privacy (S&P), 2017, pp. 357–372

  27. [27]

    Listen to your fingers: User authentication based on geometry biometrics of touch gestures,

    H. Chen, F. Li, W. Du, S. Yang, M. Conn, and Y . Wang, “Listen to your fingers: User authentication based on geometry biometrics of touch gestures,”in the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 4, no. 3, 2020

  28. [28]

    Touchscreens can reveal user identity: Capacitive plethysmogram-based biometrics,

    J. Wu, X. Ji, Y . Lyu, X. Luo, Y . Meng, E. Morales, D. Wang, and X. Luo, “Touchscreens can reveal user identity: Capacitive plethysmogram-based biometrics,”IEEE Transactions on Mobile Computing, vol. 23, no. 1, pp. 895–908, 2024

  29. [29]

    Tinyvit: Fast pretraining distillation for small vision transformers,

    K. Wu, J. Zhang, H. Peng, M. Liu, B. Xiao, J. Fu, and L. Yuan, “Tinyvit: Fast pretraining distillation for small vision transformers,” in the European Conference on Computer Vision (ECCV), 2022, pp. 68–85

  30. [30]

    Capacitive images,

    H. Le, “Capacitive images,” http://huyle.de/blog/capacitive-images/, 2019

  31. [31]

    Qualcomm: Enabling a Connected World,

    Qualcomm Incorporated, “Qualcomm: Enabling a Connected World,” https://www.qualcomm.com, 2024

  32. [32]

    Behavioral Biometrics for Continuous Authentication,

    BehavioSec, “Behavioral Biometrics for Continuous Authentication,” https://www.behaviosec.com, 2022

  33. [33]

    BioCatch: Behavioral Biometrics for Fraud Prevention,

    BioCatch, “BioCatch: Behavioral Biometrics for Fraud Prevention,” https://www.biocatch.com/, 2023

  34. [34]

    Live20R Fingerprint Reader,

    ZKTeco, “Live20R Fingerprint Reader,” https://www.zksps.com/ productinfo/46887.html, 2023

  35. [35]

    Fingervib: Fortifying acoustic-based authentication with finger vibration biometrics on smartphones,

    Y . Wu, S. Bai, R. Lv, X. Gong, X. Liu, L. Ding, and Y . Chen, “Fingervib: Fortifying acoustic-based authentication with finger vibration biometrics on smartphones,”IEEE Transactions on Information Forensics and Security, vol. 20, pp. 8936–8950, 2025

  36. [36]

    Deep residual learning for image recognition,

    K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” inthe IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778

  37. [37]

    Efficientnet: Rethinking model scaling for con- volutional neural networks,

    M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for con- volutional neural networks,” inthe 36th International Conference on Machine Learning (ICML), 2019, pp. 6105–6114

  38. [38]

    Vibwrite: Towards finger- input authentication on ubiquitous surfaces via physical vibration,

    J. Liu, C. Wang, Y . Chen, and N. Saxena, “Vibwrite: Towards finger- input authentication on ubiquitous surfaces via physical vibration,” inthe ACM SIGSAC Conference on Computer and Communications Security (CCS), 2017, p. 73–87