Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers
Pith reviewed 2026-05-19 17:47 UTC · model grok-4.3
The pith
DrvHorn detects reference counting bugs in Linux kernel drivers by reducing verification to assertion checking and finds 545 issues including 424 unknown ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that by leveraging the Linux driver interface to reduce reference counting verification to an assertion checking problem, combined with aggressive program slicing, DrvHorn can automatically find reference counting bugs in kernel drivers. When applied to all platform drivers in version 6.6 of the Linux kernel, the tool located 545 such bugs of which 424 were previously unknown, achieving a lower false positive rate of 29.9% than prior work. Patches were submitted to fix the root causes of the new bugs, resulting in 45 merges.
What carries the argument
Reduction of reference counting verification to an assertion checking problem using the Linux driver interface, enabled by kernel modeling and program slicing.
If this is right
- The approach identifies a large number of real bugs that can be patched to improve kernel stability.
- Lower false positive rates make the tool more practical for developers to use on large codebases.
- Submitting and merging patches demonstrates the tool's utility in real-world kernel maintenance.
- Similar modeling techniques could be applied to detect other classes of bugs in operating system code.
Where Pith is reading between the lines
- Integrating DrvHorn-like tools into continuous integration for kernel development might catch bugs earlier.
- Reference counting issues may be more widespread in driver code than previously detected by manual or less efficient methods.
- Extending the method to other kernel subsystems beyond platform drivers could reveal additional vulnerabilities.
Load-bearing premise
The modeling of the Linux kernel and the reduction of reference counting verification to an assertion checking problem accurately captures real driver behavior without introducing significant false negatives or inaccuracies in the analysis.
What would settle it
Manually auditing a random sample of 50 reported bugs to measure the true false positive rate and comparing it to 29.9%, or checking detection on a driver with a known reference counting bug that was fixed.
Figures
read the original abstract
Reference counting bugs in Linux kernel drivers can lead to severe resource mismanagement and security vulnerabilities. We introduce DrvHorn, a novel automated tool to detect these bugs by reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing, DrvHorn discovered 545 bugs, of which 424 were previously unknown, across all platform drivers in v6.6 Linux kernel, with a lower false positive rate of 29.9% compared to prior studies. To address the root causes of these newly discovered bugs, we submitted patches to the Linux kernel, and 45 of them were merged.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DrvHorn, a tool that models the Linux kernel driver interface and applies aggressive program slicing to reduce reference-counting bug detection to an assertion-checking problem. Applied to all platform drivers in Linux v6.6, the tool reports 545 bugs (424 previously unknown) at a 29.9% false-positive rate, with 45 patches submitted and merged.
Significance. If the kernel model and slicing are faithful, the work would be a meaningful advance in automated detection of a high-impact class of kernel bugs, evidenced by the volume of new bugs found, the lower FP rate relative to prior studies, and the concrete outcome of merged patches. The approach of reducing refcount verification to assertions is a clean technical idea that could be reusable.
major comments (2)
- [abstract and §3] The central claim that DrvHorn discovered 424 new bugs rests on the fidelity of the kernel model and slicing (abstract and §3). No independent recall evaluation against a ground-truth set of known reference-counting bugs is reported, leaving open the possibility that omitted paths, concurrent accesses, or driver-specific entry points produce unmeasured false negatives.
- [abstract] Validation of the 29.9% false-positive rate and confirmation of the 424 new bugs is described only at a high level (abstract). The manuscript does not detail the manual or automated process used to classify reports as true positives versus false positives, which is load-bearing for the comparative claim against prior studies.
minor comments (2)
- [evaluation] The abstract states results for 'all platform drivers' but the precise scope of the analyzed codebase (e.g., which drivers or subsystems were excluded by slicing) should be stated explicitly in the evaluation section.
- [§3.1] Notation for the modeled reference-counting APIs and the assertion templates could be clarified with a small table or example in §3.1 to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the potential impact of DrvHorn and for the detailed comments. We address each of the major comments below and outline the revisions we plan to make.
read point-by-point responses
-
Referee: [abstract and §3] The central claim that DrvHorn discovered 424 new bugs rests on the fidelity of the kernel model and slicing (abstract and §3). No independent recall evaluation against a ground-truth set of known reference-counting bugs is reported, leaving open the possibility that omitted paths, concurrent accesses, or driver-specific entry points produce unmeasured false negatives.
Authors: We acknowledge that the absence of a recall evaluation against a complete ground-truth set is a limitation of our evaluation. Constructing such a ground truth is inherently difficult given the size of the Linux kernel and the lack of an exhaustive catalog of all reference counting bugs. Our validation instead relies on manual analysis of reported bugs and the acceptance of our patches by kernel maintainers, which provides evidence that the detected issues are real. In the revised manuscript, we will expand §3 to discuss the modeling assumptions and potential sources of false negatives in more detail, including omitted paths and concurrency considerations. revision: yes
-
Referee: [abstract] Validation of the 29.9% false-positive rate and confirmation of the 424 new bugs is described only at a high level (abstract). The manuscript does not detail the manual or automated process used to classify reports as true positives versus false positives, which is load-bearing for the comparative claim against prior studies.
Authors: We agree that more details on the classification process would improve the manuscript. The 29.9% false-positive rate was determined through manual inspection of a random sample of the reported bugs by the authors, cross-referenced with kernel documentation and code review. We will revise the abstract and add a new subsection in the evaluation section to describe this process, including the number of reports inspected, the criteria for classifying a report as a true positive (e.g., confirmation of incorrect reference counting leading to potential use-after-free or double-free), and how this compares to prior work. revision: yes
Circularity Check
No significant circularity; results from empirical tool application
full rationale
The paper's core chain is the construction of DrvHorn (kernel modeling + slicing to reduce refcount verification to assertions), followed by its execution on the v6.6 platform drivers to produce bug reports and patch submissions. These outputs are direct consequences of running the implemented analyzer on external source code, not reductions of fitted parameters, self-definitions, or self-citation chains back to the same inputs. External validation via merged patches further separates the discovery numbers from any internal tautology. The methodology choices are presented as engineering decisions rather than derived claims that loop on themselves.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The Linux driver interface can be leveraged to model reference counting behavior accurately for verification purposes.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We introduce DrvHorn... reducing reference counting verification to an assertion checking problem leveraging the Linux driver interface. Through efficient modeling of the Linux kernel and aggressive program slicing
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
refcount difference should be zero when initialization fails
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Devicetree Kernel API,https://docs.kernel.org/devicetree/kernel-api. html
-
[2]
Linux and the Devicetree,https://docs.kernel.org/devicetree/usage-model. html
-
[3]
LLVM’s Analysis and Transform Passes,https://llvm.org/docs/Passes.html
-
[4]
CVE-2019-11487,https://cve.mitre.org/cgi-bin/cvename.cgi? name=CVE-2019-11487
MITRE. CVE-2019-11487,https://cve.mitre.org/cgi-bin/cvename.cgi? name=CVE-2019-11487
work page 2019
-
[5]
CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192
MITRE. CVE-2023-7192,https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2023-7192
work page 2023
-
[6]
In: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security
Bai, S., Zhang, Z., Hu, H.: Countdown: Refcount-guided fuzzing for exposing tem- poral memory errors in linux kernel. In: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. pp. 1315–1329 (2024)
work page 2024
-
[7]
ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)
Ball, T., Bounimova, E., Cook, B., Levin, V., Lichtenberg, J., McGarvey, C., On- drusek, B., Rajamani, S.K., Ustuner, A.: Thorough static analysis of device drivers. ACM SIGOPS Operating Systems Review40(4), 73–85 (2006)
work page 2006
-
[8]
Communications of the ACM54(7), 68–76 (2011)
Ball, T., Levin, V., Rajamani, S.K.: A decade of software model checking with SLAM. Communications of the ACM54(7), 68–76 (2011)
work page 2011
-
[9]
Bjørner, N., Gurfinkel, A., McMillan, K., Rybalchenko, A.: Horn clause solvers for program verification. In: Fields of Logic and Computation II: Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, pp. 24–51. Springer (2015)
work page 2015
-
[10]
In: International Conference on Tools and Algorithms for the Con- struction and Analysis of Systems
Emmi, M., Jhala, R., Kohler, E., Majumdar, R.: Verifying reference counting im- plementations. In: International Conference on Tools and Algorithms for the Con- struction and Analysis of Systems. pp. 352–367. Springer (2009)
work page 2009
-
[11]
In: International Conference on Computer Aided Verification
Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: International Conference on Computer Aided Verification. pp. 343–
-
[12]
Gurfinkel, A., Navas, J.A.: A context-sensitive memory model for verification of C/C++ programs. In: Static Analysis: 24th International Symposium, SAS 2017, New York, NY, USA, August 30–September 1, 2017, Proceedings 24. pp. 148–168. Springer (2017)
work page 2017
-
[13]
Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Software verification with BLAST. In: Model Checking Software: 10th International SPIN Workshop Port- land, OR, USA, May 9–10, 2003 Proceedings 10. pp. 235–239. Springer (2003)
work page 2003
-
[14]
The Journal of Symbolic Logic16(1), 14–21 (1951)
Horn, A.: On sentences which are true of direct unions of algebras1. The Journal of Symbolic Logic16(1), 14–21 (1951)
work page 1951
-
[15]
Kroah-Hartman, G.: kobjects and krefs. In: Linux Symposium. p. 295 (2004)
work page 2004
-
[16]
In: International symposium on code generation and opti- mization, 2004
Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program anal- ysis & transformation. In: International symposium on code generation and opti- mization, 2004. CGO 2004. pp. 75–86. IEEE (2004) Automatic Detection of Reference Counting Bugs in Linux Kernel Drivers 13
work page 2004
-
[17]
In: 31st USENIX Security Symposium (USENIX Security 22)
Liu, J., Yi, L., Chen, W., Song, C., Qian, Z., Yi, Q.: LinKRID: Vetting imbal- ance reference counting in linux kernel with symbolic execution. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 125–142 (2022)
work page 2022
-
[18]
Mao, J., Chen, Y., Xiao, Q., Shi, Y.: RID: finding reference count bugs with in- consistent path pair checking. In: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems. pp. 531–544 (2016)
work page 2016
-
[19]
In: International Workshop on Parallel and Distributed Methods in Verification
Mühlberg, J.T., Lüttgen, G.: Blasting linux code. In: International Workshop on Parallel and Distributed Methods in Verification. pp. 211–226. Springer (2006)
work page 2006
-
[20]
In: International conference on integrated formal methods
Post, H., Küchlin, W.: Integrated static analysis for linux device driver verification. In: International conference on integrated formal methods. pp. 518–537. Springer (2007)
work page 2007
-
[21]
Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)
Post, H., Sinz, C., Küchlin, W.: Towards automatic software model checking of thousands of linux modules—a case study with Avinux. Software Testing, Verifi- cation and Reliability19(2), 155–172 (2009)
work page 2009
-
[22]
In: 2022 Formal Methods in Computer-Aided Design (FMCAD)
Priya, S., Su, Y., Bao, Y., Zhou, X., Vizel, Y., Gurfinkel, A.: Bounded model check- ing for LLVM. In: 2022 Formal Methods in Computer-Aided Design (FMCAD). pp. 214–224. TU Wien Academic Press (2022)
work page 2022
-
[23]
Tan,X.,Zhang,Y.,Yang,X.,Lu,K.,Yang,M.:Detectingkernelrefcountbugswith {Two-Dimensional}consistency checking. In: 30th USENIX Security Symposium (USENIX Security 21). pp. 2471–2488 (2021) 14 Joe Hattori , Naoki Kobayashi , and Ken Sakayori Appendix A Patch A.1 Commit 5d8e297 Related to Section 4.2 (Kernel Bug Fixes and Contributions). This example fixes a ...
work page 2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.