
arxiv: 2606.19983 · v1 · pith:UTNCSLICnew · submitted 2026-06-18 · 💻 cs.CR

A Measurement Study of Cryptographic Misuse in Embodied AI Mobile Applications

Pith reviewed 2026-06-26 17:03 UTC · model grok-4.3

classification 💻 cs.CR
keywords cryptographic misuse · embodied AI · mobile applications · measurement study · cyber-physical security · security trade-offs · EAI mobile ecosystem

The pith

Embodied AI mobile applications exhibit widespread cryptographic misuse driven by domain-specific engineering constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper performs the first large-scale measurement of cryptographic misuse inside embodied AI mobile applications. It assembles a benchmark of 507 real-world apps spanning six EAI domains and runs an automated semantic-aware pipeline against five major failure modes. The analysis surfaces 12,975 misuse instances at 80.74 percent evaluated precision. The results indicate that the failures arise from concrete EAI engineering pressures such as latency demands in control paths and dependence on offline provisioning plus legacy IoT SDKs, rather than from isolated coding mistakes. These mobile-side flaws create an attack surface that can bypass network protections and allow direct hijacking of physical EAI devices.

Core claim

Through analysis of 507 real-world EAI mobile applications across six domains, the study identifies 12,975 cryptographic misuse findings with 80.74% precision. These failures result from EAI-specific constraints including latency-sensitive control paths that weaken transport protection and heavy use of offline provisioning and legacy IoT SDKs that promote credential hardcoding. Real-world cases demonstrate how such flaws allow interception of command channels and hijacking of EAI device control.
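As a rough reading of these two numbers, the expected count of genuine misuses can be estimated by multiplying the findings by the precision. This is only a sketch: it assumes the evaluated precision applies uniformly to all 12,975 findings, which the summary does not state, and the function name is illustrative.

```python
def expected_true_positives(findings: int, precision: float) -> float:
    """Expected number of genuine findings if precision holds uniformly."""
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be in [0, 1]")
    return findings * precision


reported_findings = 12_975      # from the paper's abstract
reported_precision = 0.8074     # from the paper's abstract

estimate = expected_true_positives(reported_findings, reported_precision)
print(f"likely true misuses:     {estimate:.0f}")
print(f"likely false positives:  {reported_findings - estimate:.0f}")
```

Under that uniformity assumption, roughly 10,476 findings would be genuine and about 2,499 would be false positives.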

What carries the argument

The EAIAppZoo benchmark of 507 applications paired with an automated semantic-aware analysis pipeline that detects five cryptographic failure modes.

If this is right

  • Latency-sensitive control paths in EAI apps systematically weaken transport-layer protections.
  • Offline device provisioning leads to local hardcoding of authentication credentials.
  • Legacy IoT SDKs increase the incidence of hardcoded credentials inside mobile control apps.
  • Mobile applications form a fragile cryptographic trust boundary in cyber-physical EAI systems.
  • Adversaries can exploit these mobile flaws to bypass nominal network protections and directly control physical EAI entities.
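To make two of these patterns concrete, here is a minimal sketch of what a line-based check for a hardcoded key and a cleartext endpoint might look like on decompiled Java source. The rule names, the regular expressions, and the sample snippet are all hypothetical; this is not the paper's semantic-aware pipeline, which is not described in enough detail here to reproduce.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    rule: str
    line_no: int
    text: str


# Hypothetical illustration rules, NOT the paper's detection logic.
RULES = {
    # A string literal passed directly into a SecretKeySpec constructor.
    "hardcoded-key": re.compile(
        r'new\s+SecretKeySpec\s*\(\s*"[^"]+"\s*\.getBytes\s*\('
    ),
    # A plain-HTTP URL literal, i.e. no transport protection.
    "cleartext-url": re.compile(r'"http://[^"]+"'),
}


def scan_source(source: str) -> list[Finding]:
    """Return every rule match, one Finding per (line, rule) pair."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(rule, line_no, line.strip()))
    return findings


# Made-up snippet standing in for decompiled app code.
SAMPLE = '''
SecretKeySpec key = new SecretKeySpec("0123456789abcdef".getBytes(), "AES");
String endpoint = "http://192.168.4.1/cmd";
String safe = "https://example.com/api";
'''

for finding in scan_source(SAMPLE):
    print(finding.rule, finding.line_no)
```

A purely syntactic check like this cannot tell whether a flagged line sits on a device-control path, which is the kind of context a semantic-aware analysis would need to add.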

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Security design for EAI systems may require new patterns that reconcile real-time control requirements with cryptographic needs.
  • Audits of cyber-physical systems should treat mobile control applications as a primary rather than secondary target.
  • Similar measurement methods could expose comparable issues in other mobile-controlled physical domains such as industrial robotics or building automation.
  • The observed trade-offs suggest value in domain-specific cryptographic libraries tuned for offline and low-latency EAI scenarios.

Load-bearing premise

The automated semantic-aware analysis pipeline accurately detects the five major cryptographic failure modes at the reported precision without substantial false positives that would alter the prevalence conclusions.

What would settle it

A manual review of several hundred randomly sampled detections that yields a true precision well below 80.74 percent or shows that the detected failures occur independently of EAI-specific constraints such as latency or offline provisioning.
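One way to judge whether a sampled precision is "well below" the reported figure is to put a confidence interval around it. The sketch below uses the Wilson score interval for a binomial proportion; the sample counts (240 confirmed out of 300 reviewed) are invented for illustration and do not come from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical manual review: 240 of 300 sampled detections confirmed.
low, high = wilson_interval(successes=240, n=300)
reported = 0.8074
print(f"95% interval: [{low:.3f}, {high:.3f}]")
print("consistent with reported precision:", low <= reported <= high)
```

With these made-up counts the reported 80.74% lies inside the interval. A review whose upper bound fell clearly below 0.8074 would be the kind of evidence this paragraph describes.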

Figures

Figures reproduced from arXiv: 2606.19983 by Boyang Ma, Junchao Li, Minghui Xu, Qi Wang, Xuelei Wang, Xuelong Dai, Yue Zhang, Yuhang Huang.

Figure 1. Dual-mode control interactions in EAI systems. A typical EAI ecosystem involves three tightly connected entities: the mobile application, the cloud service, and the physical embodied device.
Figure 2. Overview of the EAGLE framework. Stage I: Dataset Construction and Input. The measurement pipeline begins with dataset construction, aiming to establish a representative sample of real-world embodied AI mobile applications before any security analysis is performed. Starting from candidate applications collected from mainstream app…
Figure 3. Category distribution in EAIAppZoo. To support the large-scale measurement of cryptographic misuses and answer our research questions, we require a highly representative, real-world dataset. To this end, we constructed EAIAppZoo, a benchmark dataset containing 507 in-the-wild Android applications. These applications were systematically collected from official application markets (e.g., Google Play) a…
Figure 4. Overall frequency of the five cryptographic misuse categories across the EAIAppZoo. Ecosystem-wide Prevalence. Our measurement identifies 12,975 distinct cryptographic misuses across the 507 applications.
Figure 5. Distribution of cryptographic misuse cat…
Figure 6. Physical hijacking of a Unitree quadruped robot after command injection. Once the attacker joins the home Wi-Fi using the stolen credentials, they bypass all perimeter firewalls, allowing lateral movement to compromise other smart home devices. More dangerously, the attacker can reuse these hardcoded parameters to forge valid, encrypted kinetic control payloads.
Original abstract

Embodied AI (EAI) mobile applications are evolving from auxiliary user interfaces into active control-path components, directly linking mobile-side cryptographic security to cyber-physical trust. Despite this shift, existing security research predominantly focuses on embodied AI devices and cloud infrastructures, leaving the mobile control layer largely unexplored as a critical attack surface. To bridge this gap, we present the first large-scale measurement study of cryptographic misuse within the EAI mobile ecosystem. We construct EAIAppZoo, a benchmark of 507 real-world applications across six EAI domains, and employ an automated semantic-aware analysis pipeline to measure the prevalence and characteristics of five major cryptographic failure modes. Our measurement yields 12,975 misuse findings (with an evaluated precision of 80.74%), revealing that these cryptographic failures are driven by EAI-specific engineering constraints rather than random developer errors. We uncover structural security trade-offs: latency-sensitive control paths systematically weaken transport protection, while the heavy reliance on offline device provisioning and legacy IoT SDKs exacerbates the local hardcoding of authentication credentials. Through real-world case studies, we demonstrate how these mobile-side cryptographic flaws bypass nominal network protections, enabling adversaries to intercept command channels and hijack the physical control of EAI entities. Ultimately, our findings highlight that mobile applications have become a fragile, yet overlooked, cryptographic trust boundary in cyber-physical systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper presents the first large-scale measurement study of cryptographic misuse in Embodied AI (EAI) mobile applications. It constructs the EAIAppZoo benchmark containing 507 real-world apps across six EAI domains and applies an automated semantic-aware analysis pipeline to quantify five major cryptographic failure modes. The study reports 12,975 misuse findings at an evaluated precision of 80.74%, attributes the failures to EAI-specific engineering constraints (latency-sensitive control paths, offline provisioning, legacy IoT SDKs) rather than random errors, and includes case studies showing bypass of network protections leading to physical control hijacking.

Significance. If the pipeline precision and EAI-specific attribution hold, the work is significant as the first focused measurement on the mobile control layer in cyber-physical EAI systems. The scale (507 apps, 12,975 findings) and demonstration of structural trade-offs provide concrete evidence of an overlooked trust boundary. The empirical nature and real-world case studies are strengths, though validation of the detection pipeline is required to support the prevalence and attribution claims.

major comments (1)
  1. [Abstract / methodology] Abstract and methodology description: the central claim of 12,975 findings and EAI-specific drivers rests on the automated semantic-aware pipeline achieving 80.74% precision. No details are provided on pipeline construction, the five failure-mode detection rules, dataset characteristics (e.g., app selection criteria, domain distribution), or the manual review process used to compute precision (sample size, stratification, false-positive analysis). This prevents verification that the reported count and attribution are not inflated by systematic over-flagging of legacy SDK patterns common in EAI.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for greater methodological transparency. We will revise the manuscript to include a substantially expanded methodology section addressing all points raised, enabling verification of the pipeline, findings, and EAI-specific attribution.

  1. Referee: [Abstract / methodology] Abstract and methodology description: the central claim of 12,975 findings and EAI-specific drivers rests on the automated semantic-aware pipeline achieving 80.74% precision. No details are provided on pipeline construction, the five failure-mode detection rules, dataset characteristics (e.g., app selection criteria, domain distribution), or the manual review process used to compute precision (sample size, stratification, false-positive analysis). This prevents verification that the reported count and attribution are not inflated by systematic over-flagging of legacy SDK patterns common in EAI.

    Authors: We agree that the submitted manuscript provides insufficient detail on these elements. In the revised version we will add a new subsection (Section 3.2) that: (1) describes the pipeline construction, including the combination of static analysis with semantic context extraction to identify EAI control paths; (2) enumerates the five failure-mode detection rules with examples, explicitly noting how semantic checks distinguish EAI-specific misuse from generic legacy SDK patterns; (3) details dataset characteristics, including app selection criteria (Google Play search with EAI-related keywords followed by manual confirmation of device-control functionality), the six-domain distribution, and exclusion criteria; and (4) reports the manual validation protocol, including the 200-finding stratified sample (by domain and failure mode), reviewer process, and false-positive breakdown showing that flagged legacy-SDK cases were manually inspected and not over-counted. These additions will directly support the prevalence and attribution claims. revision: yes
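The stratified validation sample mentioned in this simulated response could be drawn along the following lines. This is a sketch under stated assumptions: the record fields (`domain`, `failure_mode`), the proportional allocation with a minimum of one item per stratum, and the synthetic data are all illustrative and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_sample(findings, total, seed=0):
    """Draw about `total` findings, proportionally per (domain, failure_mode).

    Each non-empty stratum gets at least one item, so the result can
    deviate slightly from `total` because of rounding.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    strata = defaultdict(list)
    for finding in findings:
        strata[(finding["domain"], finding["failure_mode"])].append(finding)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        quota = max(1, round(total * len(members) / len(findings)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Synthetic findings with placeholder domain and failure-mode labels.
synthetic = (
    [{"domain": "domain-A", "failure_mode": "hardcoded-credential"}] * 60
    + [{"domain": "domain-A", "failure_mode": "weak-transport"}] * 30
    + [{"domain": "domain-B", "failure_mode": "hardcoded-credential"}] * 10
)

picked = stratified_sample(synthetic, total=20)
print(len(picked))
```

Proportional allocation keeps the sample's mix close to that of the full finding set, so an overall precision estimate is not dominated by one domain or failure mode.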

Circularity Check

0 steps flagged

No circularity: empirical measurement study with no derivations or fitted predictions


The paper is a large-scale empirical measurement of cryptographic misuse across 507 apps using an automated semantic-aware pipeline. It reports raw counts (12,975 findings) and an independently evaluated precision (80.74%) from manual review. No equations, parameters, predictions, or derivations exist that could reduce to inputs by construction. No self-citation chains, ansatzes, or uniqueness theorems are invoked as load-bearing steps. The central claims rest on direct pipeline outputs and external evaluation, satisfying the self-contained benchmark for score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical measurement study with no theoretical derivations or new constructs; relies on standard security analysis techniques.

pith-pipeline@v0.9.1-grok · 5792 in / 982 out tokens · 30416 ms · 2026-06-26T17:03:03.887570+00:00 · methodology

