pith. sign in

arxiv: 2605.13246 · v2 · pith:IGUTFW24new · submitted 2026-05-13 · 💻 cs.CR · cs.SE

Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers

Pith reviewed 2026-05-19 17:47 UTC · model grok-4.3

classification 💻 cs.CR cs.SE
keywords reference countingLinux kernelbug detectionstatic analysisassertion checkingprogram slicingsecuritydevice drivers
0
0 comments X

The pith

DrvHorn detects reference counting bugs in Linux kernel drivers by reducing verification to assertion checking and finds 545 issues including 424 unknown ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Reference counting errors in Linux kernel drivers can lead to resource leaks and security problems. DrvHorn is introduced as a tool that models the driver interface to convert the detection of these bugs into a standard assertion checking problem. It uses efficient kernel modeling and program slicing to analyze the entire set of platform drivers in the v6.6 kernel. This resulted in identifying 545 bugs, with 424 not previously known, and a false positive rate of 29.9 percent. The authors submitted patches for the new bugs and had 45 accepted into the kernel.

Core claim

The paper claims that by leveraging the Linux driver interface to reduce reference counting verification to an assertion checking problem, combined with aggressive program slicing, DrvHorn can automatically find reference counting bugs in kernel drivers. When applied to all platform drivers in version 6.6 of the Linux kernel, the tool located 545 such bugs of which 424 were previously unknown, achieving a lower false positive rate of 29.9% than prior work. Patches were submitted to fix the root causes of the new bugs, resulting in 45 merges.

What carries the argument

Reduction of reference counting verification to an assertion checking problem using the Linux driver interface, enabled by kernel modeling and program slicing.

If this is right

  • The approach identifies a large number of real bugs that can be patched to improve kernel stability.
  • Lower false positive rates make the tool more practical for developers to use on large codebases.
  • Submitting and merging patches demonstrates the tool's utility in real-world kernel maintenance.
  • Similar modeling techniques could be applied to detect other classes of bugs in operating system code.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Integrating DrvHorn-like tools into continuous integration for kernel development might catch bugs earlier.
  • Reference counting issues may be more widespread in driver code than previously detected by manual or less efficient methods.
  • Extending the method to other kernel subsystems beyond platform drivers could reveal additional vulnerabilities.

Load-bearing premise

The modeling of the Linux kernel and the reduction of reference counting verification to an assertion checking problem accurately captures real driver behavior without introducing significant false negatives or inaccuracies in the analysis.

What would settle it

Manually auditing a random sample of 50 reported bugs to measure the true false positive rate and comparing it to 29.9%, or checking detection on a driver with a known reference counting bug that was fixed.

Figures

Figures reproduced from arXiv: 2605.13246 by Joe Hattori, Ken Sakayori, Naoki Kobayashi.

Figure 1
Figure 1. Figure 1: Commit e386209 3 Proposed Method [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall architecture of DrvHorn 3.1 Reduction to Assertion Checking Problem This section describes how we reduce refcount verification to an assertion checking problem by exploiting the structure of the Linux kernel driver interface. In general, static refcount bug detection in the Linux kernel tends to lack scalability, accuracy, or both due to its vastness and complexity. However, refcount invariants in … view at source ↗
Figure 3
Figure 3. Figure 3: Abstract overview of the verification target. The assertion checking problem is expressed in LLVM bitcode ( [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Abstract implementation of usb_find_interface(). In this example, bus_find_device() returns a refcount-incremented struct device embedded in a struct usb_interface, and the driver recovers the enclosing interface structure via container_of. Device Resource Management Devres provides a device-managed layer of cleanup. __devm_add_action() is the API that registers a devres entry by asso￾ciating a cleanup fun… view at source ↗
Figure 5
Figure 5. Figure 5: Simplified version of tpm_bios_measurements_open() before (left) and after (right) slicing. 3.4 CHC-based Verification As described in Section 3.1, it suffices to solve the assertion checking problem to verify the refcount invariant. Arbitrary off-the-shelf solvers can be used to solve this problem if they can accept LLVM bitcode as the input. We adopted SeaHorn [7] as our backend solver since it is suited… view at source ↗
Figure 6
Figure 6. Figure 6: Commit 5d8e297 tpm_bios_measurements_open() is registered as the open operation for tpm_bios_measurements_ops. It calls get_device() to increment the refcount of chip->dev, but the original code did not decrement it in the error path, leading to a memory leak. The patch adds put_device() in the error branch to balance the refcount. A.2 Commit 60e7c43e Related to Section 4.2 (Kernel Bug Fixes and Contributi… view at source ↗
Figure 7
Figure 7. Figure 7: Commit 60e7c43e node. This can lead to use-after-free bugs. The patch replaces the loop with for_each_child_of_node() and of_get_child_by_name(), and removes the extra of_node_put(). A.4 Commit 54a8cd0 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example removes a stored device link pointer that should only be used for a null check. With the DL_FLAG_AUTOREMOVE_SUPPLIER flag, the device… view at source ↗
Figure 8
Figure 8. Figure 8: Commit b9784e5cd struct qcom_ice { ... - struct device_link *link; }; struct qcom_ice *of_qcom_ice_get(struct device *dev) { struct qcom_ice *ice; + struct device_link *link; ... - ice->link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); - if (!ice->link) + link = device_link_add(dev, ..., DL_FLAG_AUTOREMOVE_SUPPLIER); + if (!link) ... } [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Commit 54a8cd0 pointer in a field struct device_link *link; of struct qcom_ice, which was unnecessary. The patch removes the field and uses a local variable solely for the null check [PITH_FULL_IMAGE:figures/full_fig_p016_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Implementation of acer_platform_probe. C Paper-Code Mapping This section briefly explains where to look in the codebase. As already men￾tioned, DrvHorn is availabe at github.com/joehattori/drvhorn. DrvHorn is developed as a fork of SeaHorn, and most of the modifications are made in the directory lib/Transforms/Kernel/. The following table gives a mapping describing in which file the contents of the paper … view at source ↗
read the original abstract

Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool to detect these bugs by reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in v6.6 Linux kernel, with a lower false positive rate of 29.9% compared to prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces DrvHorn, a tool that models the Linux kernel driver interface and applies aggressive program slicing to reduce reference-counting bug detection to an assertion-checking problem. Applied to all platform drivers in Linux v6.6, the tool reports 545 bugs (424 previously unknown) at a 29.9% false-positive rate, with 45 patches submitted and merged.

Significance. If the kernel model and slicing are faithful, the work would be a meaningful advance in automated detection of a high-impact class of kernel bugs, evidenced by the volume of new bugs found, the lower FP rate relative to prior studies, and the concrete outcome of merged patches. The approach of reducing refcount verification to assertions is a clean technical idea that could be reusable.

major comments (2)
  1. [abstract and §3] The central claim that DrvHorn discovered 424 new bugs rests on the fidelity of the kernel model and slicing (abstract and §3). No independent recall evaluation against a ground-truth set of known reference-counting bugs is reported, leaving open the possibility that omitted paths, concurrent accesses, or driver-specific entry points produce unmeasured false negatives.
  2. [abstract] Validation of the 29.9% false-positive rate and confirmation of the 424 new bugs is described only at a high level (abstract). The manuscript does not detail the manual or automated process used to classify reports as true positives versus false positives, which is load-bearing for the comparative claim against prior studies.
minor comments (2)
  1. [evaluation] The abstract states results for 'all platform drivers' but the precise scope of the analyzed codebase (e.g., which drivers or subsystems were excluded by slicing) should be stated explicitly in the evaluation section.
  2. [§3.1] Notation for the modeled reference-counting APIs and the assertion templates could be clarified with a small table or example in §3.1 to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the potential impact of DrvHorn and for the detailed comments. We address each of the major comments below and outline the revisions we plan to make.

read point-by-point responses
  1. Referee: [abstract and §3] The central claim that DrvHorn discovered 424 new bugs rests on the fidelity of the kernel model and slicing (abstract and §3). No independent recall evaluation against a ground-truth set of known reference-counting bugs is reported, leaving open the possibility that omitted paths, concurrent accesses, or driver-specific entry points produce unmeasured false negatives.

    Authors: We acknowledge that the absence of a recall evaluation against a complete ground-truth set is a limitation of our evaluation. Constructing such a ground truth is inherently difficult given the size of the Linux kernel and the lack of an exhaustive catalog of all reference counting bugs. Our validation instead relies on manual analysis of reported bugs and the acceptance of our patches by kernel maintainers, which provides evidence that the detected issues are real. In the revised manuscript, we will expand §3 to discuss the modeling assumptions and potential sources of false negatives in more detail, including omitted paths and concurrency considerations. revision: yes

  2. Referee: [abstract] Validation of the 29.9% false-positive rate and confirmation of the 424 new bugs is described only at a high level (abstract). The manuscript does not detail the manual or automated process used to classify reports as true positives versus false positives, which is load-bearing for the comparative claim against prior studies.

    Authors: We agree that more details on the classification process would improve the manuscript. The 29.9% false-positive rate was determined through manual inspection of a random sample of the reported bugs by the authors, cross-referenced with kernel documentation and code review. We will revise the abstract and add a new subsection in the evaluation section to describe this process, including the number of reports inspected, the criteria for classifying a report as a true positive (e.g., confirmation of incorrect reference counting leading to potential use-after-free or double-free), and how this compares to prior work. revision: yes

Circularity Check

0 steps flagged

No significant circularity; results from empirical tool application

full rationale

The paper's core chain is the construction of DrvHorn (kernel modeling + slicing to reduce refcount verification to assertions), followed by its execution on the v6.6 platform drivers to produce bug reports and patch submissions. These outputs are direct consequences of running the implemented analyzer on external source code, not reductions of fitted parameters, self-definitions, or self-citation chains back to the same inputs. External validation via merged patches further separates the discovery numbers from any internal tautology. The methodology choices are presented as engineering decisions rather than derived claims that loop on themselves.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends primarily on domain assumptions about accurate kernel modeling rather than new postulated entities or fitted parameters; the contribution is the engineering of the detection method and its empirical application.

axioms (1)
  • domain assumption The Linux driver interface can be leveraged to model reference counting behavior accurately for verification purposes.
    This underpins the reduction of bug detection to assertion checking as described in the abstract.

pith-pipeline@v0.9.0 · 5633 in / 1516 out tokens · 63775 ms · 2026-05-19T17:47:46.499020+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. [1]

    Devicetree Kernel API,https://docs.kernel.org/devicetree/kernel-api. html

  2. [2]

    Linux and the Devicetree,https://docs.kernel.org/devicetree/usage-model. html

  3. [3]

    LLVM’s Analysis and Transform Passes,https://llvm.org/docs/Passes.html

  4. [4]

    CVE-2019-11487,https://cve.mitre.org/cgi-bin/cvename.cgi? name=CVE-2019-11487

    MITRE. CVE-2019-11487,https://cve.mitre.org/cgi-bin/cvename.cgi? name=CVE-2019-11487

  5. [5]

    CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

    MITRE. CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192

  6. [6]

    In: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

    Bai, S., Zhang, Z., Hu, H.: Countdown: Refcount-guided fuzzing for exposing tem- poral memory errors in linux kernel. In: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. pp. 1315–1329 (2024)

  7. [7]

    ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

    Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., On- drusek, B., Rajamani, S.K., Ustuner, A.: Thorough static analysis of device drivers. ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)

  8. [8]

    Communications of the ACM54(7), 68–76 (2011)

    Ball, T., Levin, V., Rajamani, S.K.: A decade of software model checking with SLAM. Communications of the ACM54(7), 68–76 (2011)

  9. [9]

    In: Fields of Logic and Computation II: Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, pp

    Bjørner, N., Gurfinkel, A., McMillan, K., Rybalchenko, A.: Horn clause solvers for program verification. In: Fields of Logic and Computation II: Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, pp. 24–51. Springer (2015)

  10. [10]

    In: International Conference on Tools and Algorithms for the Con- struction and Analysis of Systems

    Emmi, M., Jhala, R., Kohler, E., Majumdar, R.: Verifying reference counting im- plementations. In: International Conference on Tools and Algorithms for the Con- struction and Analysis of Systems. pp. 352–367. Springer (2009)

  11. [11]

    In: International Conference on Computer Aided Verification

    Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: International Conference on Computer Aided Verification. pp. 343–

  12. [12]

    In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24

    Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24. pp. 148–168. Springer (2017)

  13. [13]

    In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10

    Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software verification with BLAST. In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10. pp. 235–239. Springer (2003)

  14. [14]

    The Journal of Symbolic Logic16(1), 14–21 (1951)

    Horn, A.: On sentences which are true of direct unions of algebras1. The Journal of Symbolic Logic16(1), 14–21 (1951)

  15. [15]

    In: Linux Symposium

    Kroah-Hartman, G.: kobjects and krefs. In: Linux Symposium. p. 295 (2004)

  16. [16]

    In: International symposium on code generation and opti- mization, 2004

    Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program anal- ysis & transformation. In: International symposium on code generation and opti- mization, 2004. CGO 2004. pp. 75–86. IEEE (2004) Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers 13

  17. [17]

    In: 31st USENIX Security Symposium (USENIX Security 22)

    Liu, J., Yi, L., Chen, W., Song, C., Qian, Z., Yi, Q.: LinKRID: Vetting imbal- ance reference counting in linux kernel with symbolic execution. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 125–142 (2022)

  18. [18]

    In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems

    Mao, J., Chen, Y., Xiao, Q., Shi, Y.: RID: finding reference count bugs with in- consistent path pair checking. In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. pp. 531–544 (2016)

  19. [19]

    In: International Workshop on Parallel and Distributed Methods in Verification

    Mühlberg, J.T., Lüttgen, G.: Blasting linux code. In: International Workshop on Parallel and Distributed Methods in Verification. pp. 211–226. Springer (2006)

  20. [20]

    In: International conference on integrated formal methods

    Post, H., Küchlin, W.: Integrated static analysis for linux device driver verification. In: International conference on integrated formal methods. pp. 518–537. Springer (2007)

  21. [21]

    Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

    Post, H., Sinz, C., Küchlin, W.: Towards automatic software model checking of thousands of linux modules—a case study with Avinux. Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)

  22. [22]

    In: 2022 Formal Methods in Computer-Aided Design (FMCAD)

    Priya, S., Su, Y., Bao, Y., Zhou, X., Vizel, Y., Gurfinkel, A.: Bounded model check- ing for LLVM. In: 2022 Formal Methods in Computer-Aided Design (FMCAD). pp. 214–224. TU Wien Academic Press (2022)

  23. [23]

    q6wcss"); qcom_add_pdm_subdev(rproc, &wcss->pdm_subdev); qcom_add_ssr_subdev(rproc, &wcss->ssr_subdev,

    Tan,X.,Zhang,Y.,Yang,X.,Lu,K.,Yang,M.:Detectingkernelrefcountbugswith {Two-Dimensional}consistency checking. In: 30th USENIX Security Symposium (USENIX Security 21). pp. 2471–2488 (2021) 14 Joe Hattori , Naoki Kobayashi , and Ken Sakayori Appendix A Patch A.1 Commit 5d8e297 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example fixes a ...