pith. sign in

arxiv: 2504.21312 · v2 · submitted 2025-04-30 · 💻 cs.PL

Annotating and Auditing the Safety Properties of Unsafe Rust

Pith reviewed 2026-05-22 18:38 UTC · model grok-4.3

classification 💻 cs.PL
keywords unsafe Rustsafety documentationsafety tagsstatic analysisRust standard libraryAPI soundnessundefined behavior
0
0 comments X

The pith

A taxonomy of safety tags plus structural rules can systematically audit and complete documentation for unsafe Rust APIs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to replace ad hoc safety comments in unsafe Rust code with a standardized set of tags that capture the exact conditions needed to avoid undefined behavior. Because many unsafe APIs depend on struct invariants, the authors supply concrete rules that check whether local safety notes are consistent with the requirements of every called unsafe function. They built a linter that enforces these rules automatically and ran it on the Rust standard library, where it surfaced and helped resolve documentation gaps across 27 APIs while demonstrating that the tags apply to 96.1 percent of public unsafe functions. The same tagging scheme was submitted as a Rust RFC. If the method works, developers gain a reliable way to document and review the safety contract of every unsafe boundary instead of relying on incomplete natural-language notes.

Core claim

The paper claims that formalizing safety requirements as a taxonomy of Safety Tags and enforcing a set of empirical rules for structural consistency between local annotations and callee requirements produces a practical, automated way to audit unsafe API documentation for completeness and soundness.

What carries the argument

The taxonomy of Safety Tags together with the empirical rules for structural consistency, implemented as the safety-tool static linter that checks annotations against callee requirements.

If this is right

  • Documentation for 27 APIs in the Rust standard library was updated using 61 safety tags.
  • Safety tags were found applicable to 96.1 percent of the public unsafe APIs in libstd.
  • The tagging approach was proposed to the Rust community through an official RFC.
  • The method provides an automated way to enforce consistency between safety annotations and the requirements of called unsafe functions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same tag set and rules could be integrated into IDEs to give immediate warnings when unsafe code is added without matching documentation.
  • Wider adoption might lower the rate of undefined-behavior bugs that arise from incorrect assumptions about unsafe APIs in external crates.
  • The approach could be adapted to other languages that expose unsafe or low-level primitives, such as C++ or Zig.
  • Future measurements could track whether crates that adopt the tags show fewer soundness-related issues reported over time.

Load-bearing premise

The derived taxonomy of Safety Tags and the empirical rules for structural consistency are complete enough to cover the safety properties that matter for API soundness in practice.

What would settle it

Running the same taxonomy and rules over a large collection of third-party crates and discovering many safety conditions that cannot be expressed by the existing tags or that violate the structural rules would falsify the claim of practical completeness.

Figures

Figures reproduced from arXiv: 2504.21312 by Hongliang Tian, Hui Xu, Jiping Zhou, Xin Wang, Zihao Rao.

Figure 1
Figure 1. Figure 1: Unsafety propagation graph analysis for Listing [PITH_FULL_IMAGE:figures/full_fig_p012_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Workflow of our auditing experiments. Inference Rule 12. Raw Pointer Constructor Elimination Rule Γ ⊢ Struct :: 𝑐𝑖 : unsafe, params(𝑐𝑖 ) ⊇ {raw ptr}, 𝑐𝑖 ⇒ 𝑅𝑆𝑐𝑖 ⊇ 𝑆𝑃𝑟𝑎𝑤 Γ ⊢ 𝑐 𝑗 , params(𝑐 𝑗 ) ⊉ {raw ptr} Γ ⊢ 𝑐 𝑗 ⇒ 𝑉 𝑆𝑐 𝑗 ⊇ 𝑆𝑃𝑟𝑎𝑤 Rule 11 states that if a literal constructor belongs to an enumerated type, it cannot simply be considered as veri￾fying nothing, because users can only construct an object using va… view at source ↗
Figure 3
Figure 3. Figure 3: Distribution of safety property tags in Rust standard library. [PITH_FULL_IMAGE:figures/full_fig_p021_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Distribution of structural patterns in basic units for each crate of the Rust standard library. [PITH_FULL_IMAGE:figures/full_fig_p022_4.png] view at source ↗
read the original abstract

In Rust, unsafe code is the sole source of potential undefined behaviors. To avoid misuse, Rust developers should clarify the safety properties for each unsafe API. However, the community currently lacks a key standard for safety documentation: existing safety comments in the source code and safety documentation can be ad hoc and incomplete. This paper presents a tag-centric methodology for auditing the consistency and completeness of safety documentation. We first derive a taxonomy of Safety Tags to formalize natural-language requirements. Second, because API soundness frequently relies on struct invariants, we propose a set of empirical rules to systematically audit the structural consistency of safety documentation. We implemented this methodology in safety-tool, a static linter that automatically enforces structural consistency between local safety annotations and callee requirements. Our approach was applied to the Rust standard library, fixing documentation issues on 27 APIs with 61 safety tags and identifying safety tags that are applicable to 96.1% of the public unsafe APIs in libstd. Furthermore, we have formalized the tagging idea through a Rust RFC to the wider community. We believe that the approach establishes a standardized practice of safety documentation and helps significantly reduce safety perils.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a tag-centric methodology for auditing the consistency and completeness of safety documentation for unsafe Rust APIs. It derives a taxonomy of Safety Tags to formalize natural-language safety requirements, introduces empirical rules to audit structural consistency (especially around struct invariants), implements the approach in the safety-tool static linter, applies it to the Rust standard library (fixing documentation on 27 APIs involving 61 safety tags and achieving 96.1% coverage of public unsafe APIs), and submits a Rust RFC to promote standardization.

Significance. If the taxonomy and rules are shown to be sufficiently complete and generalizable, the work could meaningfully advance standardized safety documentation practices in Rust, helping reduce misuse of unsafe APIs and associated undefined-behavior risks. The concrete application to libstd, the reported fixes, and the RFC submission provide evidence of practical utility and community engagement. The empirical grounding and linter implementation are strengths that support reproducibility.

major comments (3)
  1. [§3] §3 (Taxonomy derivation): The Safety Tags taxonomy is presented as derived from analysis of existing code and documentation, yet the manuscript provides no explicit validation against documented soundness failures or independent unsafe codebases outside libstd; this is load-bearing for the central claim that the tags cover the safety properties that matter for API soundness in practice.
  2. [§4] §4 (Empirical rules): The structural-consistency rules are described as empirical and sufficient for auditing callee requirements against local annotations, but the paper does not demonstrate that they capture all relevant dependency patterns or invariants that arise in non-stdlib unsafe code; without such evidence the linter could certify consistent but still incomplete documentation.
  3. [§5] §5 (Evaluation): The 96.1% coverage figure and the 27-API fixes are reported only for libstd; the absence of an external validation set means the reduction in safety perils cannot yet be extrapolated beyond the standard library.
minor comments (2)
  1. The definition of 'structural consistency' should be stated more formally (perhaps with a small example) in the introduction so that readers can immediately understand the auditing target.
  2. Table or figure presenting the full Safety Tags taxonomy would benefit from an additional column showing one concrete unsafe API example per tag.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the scope and limitations of our work. We address each major comment below and describe the revisions we will make to improve the manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (Taxonomy derivation): The Safety Tags taxonomy is presented as derived from analysis of existing code and documentation, yet the manuscript provides no explicit validation against documented soundness failures or independent unsafe codebases outside libstd; this is load-bearing for the central claim that the tags cover the safety properties that matter for API soundness in practice.

    Authors: The taxonomy was systematically derived by examining all public unsafe APIs and their documentation in libstd, which constitutes the foundational and most widely relied-upon set of unsafe interfaces in Rust. To address the concern, we will add to §3 an explicit mapping of each Safety Tag to the categories of undefined behavior documented in the Rust Reference and Rustonomicon, thereby connecting the tags to known soundness failure modes. While a comprehensive audit of external crates lies beyond the current scope, the submitted RFC is intended to solicit exactly such community-driven validation and extension. revision: partial

  2. Referee: [§4] §4 (Empirical rules): The structural-consistency rules are described as empirical and sufficient for auditing callee requirements against local annotations, but the paper does not demonstrate that they capture all relevant dependency patterns or invariants that arise in non-stdlib unsafe code; without such evidence the linter could certify consistent but still incomplete documentation.

    Authors: We acknowledge that the rules were developed empirically from patterns observed in libstd. In the revised version we will augment §4 with two additional examples drawn from non-stdlib unsafe code (one involving custom struct invariants and one involving trait-object safety requirements) to illustrate how the rules extend to common dependency patterns outside the standard library. We will also clarify that the linter guarantees structural consistency rather than semantic completeness, and we will add an explicit limitations paragraph on this point. revision: yes

  3. Referee: [§5] §5 (Evaluation): The 96.1% coverage figure and the 27-API fixes are reported only for libstd; the absence of an external validation set means the reduction in safety perils cannot yet be extrapolated beyond the standard library.

    Authors: The evaluation deliberately targets libstd because it is the single most critical and standardized body of unsafe APIs. The reported coverage and fixes demonstrate the methodology's immediate practical value. In the revision we will expand §5 with a short additional case study applying the linter to a popular external crate that makes heavy use of unsafe code, providing initial evidence of applicability beyond libstd. We will also note that the RFC submission is the mechanism through which broader empirical validation will occur. revision: partial

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents an empirical methodology: a taxonomy of Safety Tags is derived from analysis of natural-language safety comments and documentation, followed by a set of empirical rules for structural consistency based on struct invariants. These are then encoded into a static linter and applied to libstd, yielding the reported 96.1% coverage and 27 fixes. No step reduces by construction to its inputs via self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. The taxonomy and rules are not claimed as predictions but as outputs of the initial analysis; the coverage metric is a direct measurement on the analyzed corpus rather than an independent forecast. The central claim of establishing standardized practice rests on the methodology's utility rather than any tautological equivalence. This is a standard empirical derivation without circular reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the assumption that unsafe code is the only source of undefined behavior and that a finite set of natural-language tags can capture the necessary safety invariants for API soundness.

axioms (1)
  • domain assumption Unsafe code is the sole source of potential undefined behaviors in Rust.
    Stated directly in the opening sentence of the abstract.
invented entities (1)
  • Safety Tags taxonomy no independent evidence
    purpose: To formalize natural-language safety requirements for unsafe APIs.
    New structured labels introduced by the paper to replace ad-hoc comments.

pith-pipeline@v0.9.0 · 5731 in / 1249 out tokens · 47515 ms · 2026-05-22T18:38:42.663179+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages

  1. [1]

    Vytautas Astrauskas, Aurel Bíl`y, Jonáš Fiala, Zachary Grannan, Christoph Matheja, Peter Müller, Federico Poli, and Alexander J Summers. 2022. The prusti project: Formal verification for Rust. In NASA Formal Methods Symposium . Springer, 88–108

  2. [2]

    Vytautas Astrauskas, Christoph Matheja, Federico Poli, Peter Müller, and Alexander J Summers. 2020. How do programmers use unsafe Rust? Proceedings of the ACM on Programming Languages 4, OOPSLA (2020), 1–27

  3. [3]

    Yechan Bae, Youngsuk Kim, Ammar Askar, Jungwon Lim, and Taesoo Kim. 2021. Rudra: Finding memory safety bugs in Rust at the ecosystem scale. In Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles . 84–99

  4. [4]

    Rust Book. 2025. Unsafe Rust. https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html [Accessed: 2025-04-28]

  5. [5]

    Marco Campion, Mila Dalla Preda, and Roberto Giacobazzi. 2022. Partial (in) completeness in abstract interpretation: Limiting the imprecision in program analysis. Proceedings of the ACM on Programming Languages 6, POPL (2022), 1–31

  6. [6]

    Edmund M Clarke, William Klieber, Miloš Nováček, and Paolo Zuliani. 2011. Model checking and the state explosion problem. In LASER Summer School on Software Engineering . Springer, 1–30

  7. [7]

    Mohan Cui, Chengjun Chen, Hui Xu, and Yangfan Zhou. 2023. SafeDrop: Detecting memory deallocation bugs of rust programs via static data-flow analysis. ACM Transactions on Software Engineering and Methodology 32, 4 (2023), 1–21

  8. [8]

    Mohan Cui, Shuran Sun, Hui Xu, and Yangfan Zhou. 2024. Is unsafe an Achilles’ Heel? A comprehensive study of safety requirements in unsafe Rust programming. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering . 1–13

  9. [9]

    Ana Nora Evans, Bradford Campbell, and Mary Lou Soffa. 2020. Is Rust used safely by software developers?. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering . 246–257

  10. [10]

    Mikhail R Gadelha, Felipe R Monteiro, Jeremy Morse, Lucas C Cordeiro, Bernd Fischer, and Denis A Nicole. 2018. ESBMC 5.0: An industrial-strength C model checker. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering . 888–891

  11. [11]

    Lennard Gäher, Michael Sammler, Ralf Jung, Robbert Krebbers, and Derek Dreyer. 2024. RefinedRust: A type system for high-assurance verification of Rust programs. Proceedings of the ACM on Programming Languages 8, PLDI (2024), 1115–1139. Manuscript submitted to ACM Annotating and Auditing the Safety Properties of Unsafe Rust 27

  12. [12]

    Michael Hind. 2001. Pointer analysis: Haven’t we solved this problem yet?. InProceedings of the 2001 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering . 54–61

  13. [13]

    Bart Jacobs, Jan Smans, Pieter Philippaerts, Frédéric Vogels, Willem Penninckx, and Frank Piessens. 2011. VeriFast: A powerful, sound, predictable, fast verifier for C and Java. In NASA Formal Methods Symposium . Springer, 41–55

  14. [14]

    Ralf Jung, Hoang-Hai Dang, Jeehoon Kang, and Derek Dreyer. 2019. Stacked borrows: an aliasing model for Rust. Proceedings of the ACM on Programming Languages 4, POPL (2019), 1–32

  15. [15]

    Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2017. RustBelt: Securing the foundations of the Rust programming language. Proceedings of the ACM on Programming Languages 2, POPL (2017), 1–34

  16. [16]

    Andrea Lattuada, Travis Hance, Jay Bosamiya, Matthias Brun, Chanhee Cho, Hayley LeBlanc, Pranav Srinivasan, Reto Achermann, Tej Chajed, Chris Hawblitzel, et al. 2024. Verus: A practical foundation for systems verification. In Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles. 438–454

  17. [17]

    Andrea Lattuada, Travis Hance, Chanhee Cho, Matthias Brun, Isitha Subasinghe, Yi Zhou, Jon Howell, Bryan Parno, and Chris Hawblitzel. 2023. Verus: Verifying Rust programs using linear ghost types. Proceedings of the ACM on Programming Languages 7, OOPSLA1 (2023), 286–315

  18. [18]

    Nico Lehmann, Adam T Geller, Niki Vazou, and Ranjit Jhala. 2023. Flux: Liquid types for rust. Proceedings of the ACM on Programming Languages 7, PLDI (2023), 1533–1557

  19. [19]

    Hongyu Li, Liwei Guo, Yexuan Yang, Shangguang Wang, and Mengwei Xu. 2024. An empirical study of Rust-for-Linux: The success, dissatisfaction, and compromise. In 2024 USENIX Annual Technical Conference. 425–443

  20. [20]

    Zhuohua Li, Jincheng Wang, Mingshen Sun, and John CS Lui. 2021. MirChecker: detecting bugs in Rust programs via static analysis. In Proceedings of The 2021 ACM SIGSAC Conference on Computer and Communications Security . 2183–2196

  21. [21]

    Yusuke Matsushita, Xavier Denis, Jacques-Henri Jourdan, and Derek Dreyer. 2022. RustHornBelt: a semantic foundation for functional verification of Rust programs with unsafe code. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 841–856

  22. [22]

    Yuke Peng, Hongliang Tian, Jinyi Xian, Shuai Zhou, Shoumeng Yan, and Yinqian Zhang. 2024. Framekernel: A safe and efficient kernel architecture via Rust-based intra-kernel privilege separation. In Proceedings of the 15th ACM SIGOPS Asia-Pacific Workshop on Systems . 31–37

  23. [23]

    Boqin Qin, Yilun Chen, Haopeng Liu, Hua Zhang, Qiaoyan Wen, Linhai Song, and Yiying Zhang. 2024. Understanding and detecting real-world safety issues in Rust. IEEE Transactions on Software Engineering (2024)

  24. [24]

    Boqin Qin, Yilun Chen, Zeming Yu, Linhai Song, and Yiying Zhang. 2020. Understanding memory and thread safety practices and issues in real-world Rust programs. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation . 763–779

  25. [25]

    Ayushi Sharma, Shashank Sharma, Sai Ritvik Tanksalkar, Santiago Torres-Arias, and Aravind Machiry. 2024. Rust for Embedded Systems: Current State and Open Problems. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security . 2296–2310

  26. [26]

    The Rust Team. 2025. Instrument the Rust standard library with safety contracts. https://rust-lang.github.io/rust-project-goals/2025h1/std- contracts.html [Accessed: 2025-04-10]

  27. [27]

    The Rust Team. 2025. Miri. https://github.com/rust-lang/miri [Accessed: 2025-04-10]

  28. [28]

    The Rust Team. 2025. Module ptr. https://doc.rust-lang.org/nightly/std/ptr/index.html [Accessed: 2025-02-01]

  29. [29]

    The Rust Team. 2025. The Rust core allocation and collections library. https://doc.rust-lang.org/nightly/alloc/index.html [Accessed: 2025-04-28]

  30. [30]

    The Rust Team. 2025. The Rust core library. https://doc.rust-lang.org/nightly/core/index.html [Accessed: 2025-04-28]

  31. [31]

    The Rust Team. 2025. The Rust reference: visibility and privacy. https://doc.rust-lang.org/reference/visibility-and-privacy.html [Accessed: 2025-04-28]

  32. [32]

    The Rust Team. 2025. The Rust standard library. https://doc.rust-lang.org/nightly/std/index.html [Accessed: 2025-04-28]

  33. [33]

    The Rust Team. 2025. std::slice::from_raw_parts. https://doc.rust-lang.org/nightly/std/slice/fn.from_raw_parts.html [Accessed: 2025-04-28]

  34. [34]

    The Rust Team. 2025. Support defining C-compatible variadic functions in Rust. https://github.com/rust-lang/rfcs/pull/2137 [Accessed: 2025-04-28]

  35. [35]

    The Rust Team. 2025. Type layout. https://doc.rust-lang.org/reference/type-layout.html [Accessed: 2025-02-01]

  36. [36]

    The Rust Team. 2025. ub_checks module. https://github.com/rust-lang/rust/blob/master/library/core/src/ub_checks.rs [Accessed: 2025-04-10]

  37. [37]

    The Rust Team. 2025. The verify Rust standard library effort. https://model-checking.github.io/verify-rust-std/intro.html [Accessed: 2025-04-10]

  38. [38]

    The Rust Team. 2025. Working with unsafe. https://doc.rust-lang.org/nomicon/working-with-unsafe.html [Accessed: 2025-04-28]

  39. [39]

    Alexa VanHattum, Daniel Schwartz-Narbonne, Nathan Chong, and Adrian Sampson. 2022. Verifying dynamic trait objects in Rust. In Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice . 321–330

  40. [40]

    Hui Xu, Zhuangbin Chen, Mingshen Sun, Yangfan Zhou, and Michael R Lyu. 2021. Memory-safety challenge considered solved? An in-depth study with all Rust CVEs. ACM Transactions on Software Engineering and Methodology (TOSEM) 31, 1 (2021), 1–25. Manuscript submitted to ACM