arxiv: 2604.23331 · v1 · submitted 2026-04-25 · 💻 cs.CR

Branch Landing: Bloom Filter-Based Source Authorization for Forward-Edge CFI on RISC-V

Pith reviewed 2026-05-08 07:57 UTC · model grok-4.3

classification 💻 cs.CR
keywords control-flow integrity · forward-edge CFI · Bloom filter · source authorization · RISC-V · jump-oriented programming · ISA extensions · branch landing

The pith

Bloom filters replace rigid tags to authorize indirect branch sources in RISC-V forward-edge CFI with under half a percent overhead.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Branch Landing to give forward-edge CFI precise source authorization that type-based schemes lack and that tag-based hardware cannot scale. Two new instructions carry a source section identifier through a dedicated register and test it at each landing site using a Bloom filter query whose latency stays fixed regardless of how many sources are allowed. A sympathetic reader would care because this lets the same mechanism run either a permissive type-based policy or a tighter CFG-derived one while adding almost no runtime or code size cost. If the approach holds, programs gain stronger protection against jump-oriented programming without the usual hardware or performance trade-offs that limit current defenses.

Core claim

Branch Landing replaces fixed-capacity checks with Bloom filter membership queries at each landing site. The bld and brl instructions propagate a source Section Identifier through a dedicated BRState register and validate it with fixed-probe latency independent of the number of authorized sources under a chosen filter configuration. Section granularity is configurable, supporting policies from type-based to CFG-derived authorization within a single mechanism.

What carries the argument

The brl instruction that performs a Bloom filter membership query on the source Section Identifier to validate authorized callers at landing sites.
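As a rough illustration of the check described above, the following Python sketch models a landing-site Bloom filter and the bld/brl pair in software. It is a toy model under stated assumptions, not the paper's hardware design: the class names `BloomFilter` and `Hart`, the filter size (256 bits), the probe count (3), the SID values, and the SHA-256-based double hashing are all hypothetical choices made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; every query touches exactly k bit positions."""

    def __init__(self, m_bits: int, k_probes: int):
        self.m = m_bits
        self.k = k_probes
        self.bits = 0  # bit vector packed into a Python int

    def _positions(self, sid: int):
        # Assumption: double hashing (h1 + i*h2 mod m) stands in for
        # whatever hash functions a hardware implementation would use.
        digest = hashlib.sha256(sid.to_bytes(8, "little")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, sid: int) -> None:
        for pos in self._positions(sid):
            self.bits |= 1 << pos

    def query(self, sid: int) -> bool:
        # True means "possibly authorized"; False means "definitely not".
        return all((self.bits >> pos) & 1 for pos in self._positions(sid))


class Hart:
    """Toy model of the architectural state touched by bld and brl."""

    def __init__(self):
        self.brstate = 0  # models the BRState register

    def bld(self, sid: int) -> None:
        # Source side: record the section identifier before the indirect branch.
        self.brstate = sid

    def brl(self, landing_filter: BloomFilter) -> None:
        # Landing side: membership query on the propagated SID.
        if not landing_filter.query(self.brstate):
            raise RuntimeError("BRL enforcement failure")


# Hypothetical landing site that authorizes three source sections.
landing = BloomFilter(m_bits=256, k_probes=3)
for sid in (0x11, 0x22, 0x33):
    landing.add(sid)

hart = Hart()
hart.bld(0x22)     # an authorized source sets BRState
hart.brl(landing)  # passes: Bloom filters have no false negatives
```

The query cost depends only on `k`, not on how many SIDs were added, which is the property the fixed-probe latency claim relies on. An unauthorized SID can still pass if all of its probe positions happen to be set, which is the false-positive risk discussed under the load-bearing premise.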

If this is right

  • The same mechanism supports both type-based and CFG-derived authorization policies at section granularity.
  • Under a 3-cycle brl latency model, average runtime overhead is 0.210 percent for the type-based policy and 0.421 percent for the CFG-derived policy.
  • Mean code size growth is 0.46 percent for the type-based policy and 0.52 percent for the CFG-derived policy.
  • The CFG-derived policy shrinks the average equivalence class size by 32.5 percent compared with the type-based policy.
  • All evaluated executions complete without any enforcement failures from the checks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Fixed check latency independent of authorized-set size could suit real-time or timing-sensitive code where variable delays are unacceptable.
  • The configurable section granularity provides a single knob for trading authorization precision against implementation cost across different threat models.
  • Similar Bloom-filter checks might extend source authorization to other indirect transfers such as returns or calls if the register state and instructions are added.

Load-bearing premise

Bloom filter false positives can be kept low enough under chosen configurations to avoid both security bypasses and unacceptable performance penalties when the number of authorized sources grows.
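To make this premise concrete, the sketch below evaluates the standard Bloom filter false-positive expression, (1-(1-1/m)^{kn})^k, for a filter of m bits, k probes, and n authorized sources. The values m = 256 and k = 3 and the class sizes are hypothetical; the page does not state the parameters the paper actually uses.

```python
def false_positive_rate(m: int, k: int, n: int) -> float:
    """Probability that a non-member passes all k probes of an m-bit filter
    holding n members, assuming independent, uniform hash functions."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


# Hypothetical configuration: 256-bit filter, 3 probes per query.
M_BITS, K_PROBES = 256, 3
for n_sources in (4, 16, 64):
    rate = false_positive_rate(M_BITS, K_PROBES, n_sources)
    print(f"n={n_sources:3d}  false-positive rate={rate:.6f}")
```

For fixed m and k the rate rises monotonically with n, so the number of authorized sources per landing site, not just the average, determines how often an unauthorized source could slip through.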

What would settle it

An execution in which an unauthorized source passes the Bloom filter check at a landing site, or measurements showing runtime overheads that grow sharply once the number of valid sources increases.

Figures

Figures reproduced from arXiv: 2604.23331 by Peter Beerel, You Wu.

Figure 1: Existing and proposed forward-edge CFI defenses. (a) Type-based CFI (FineIBT): all callers sharing the same function type pass the check, including attacker-controlled foo5. (b) Tag-based hardware CFI (Bratter): the 4-slot BTR is exhausted by foo1–foo4; no slot is available to express a policy for foo5. (c) Branch Landing: source SIDs are checked via Bloom filter membership, supporting an arbitrary number …
Figure 2: Weighted performance overhead under BRL-Func and BRL-CFG policies with 3-cycle, 5-cycle, and 10-cycle brl latency assumptions and a 1-cycle bld.
… of Bratter [23], which assigns 2 cycles to each sbtag/cbtag instruction. We assign 1 cycle to ALU and bld (immediate-to-CSR write), 2 to taken branches, 3 to load/store, and 10 to ecall. For brl, we evaluate three latency points: brl3, brl5, and brl10. We detail i…
Abstract

Jump-Oriented Programming (JOP) attacks exploit indirect control transfers to bypass backward-edge defenses, yet existing forward-edge CFI mechanisms lack precise source-domain authorization: type-based CFI admits all same-signature callers, while tag-based hardware CFI is limited by fixed-width register storage that caps the number of simultaneously authorized sources. We propose Branch Landing (BRL), a landing-based forward-edge CFI framework for RISC-V that replaces fixed-capacity checks with Bloom filter membership queries. Two lightweight ISA extensions, bld and brl, propagate a source Section Identifier (SID) through a dedicated BRState register and validate it at each landing site with fixed-probe latency that is independent of the number of authorized sources under a chosen filter configuration. Section granularity is configurable, supporting policies from type-based to CFG-derived authorization within a single mechanism. We implement Branch Landing in the LLVM RISC-V backend and evaluate it on 81 BEEBS benchmarks under two representative policy configurations: a function-level, type-based policy and a basic-block-level, CFG-derived policy. Under a 3-cycle brl latency model, the two configurations incur average runtime overheads of only 0.210% and 0.421%, with mean code size growth of 0.46% and 0.52% respectively. The CFG-derived policy reduces the average equivalence class size by 32.5% compared to the type-based policy, and all evaluated executions complete without BRL enforcement failures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes Branch Landing (BRL), a forward-edge CFI framework for RISC-V that uses Bloom filters for source authorization at landing sites. It introduces two ISA extensions (bld and brl) plus a BRState register to propagate and check Section Identifiers (SIDs) with fixed probe latency independent of the number of authorized sources. The mechanism supports configurable policies ranging from type-based to CFG-derived authorization. The LLVM RISC-V backend implementation is evaluated on 81 BEEBS benchmarks under a 3-cycle brl latency model, reporting average runtime overheads of 0.210% (type-based) and 0.421% (CFG-derived), code-size growth of 0.46% and 0.52%, and a 32.5% reduction in average equivalence-class size for the CFG policy, with no observed enforcement failures.

Significance. If the security properties hold, the work provides a practical, scalable solution to the source-authorization problem in forward-edge CFI by removing the fixed-width register limit of prior tag-based schemes while retaining low overhead. The direct LLVM implementation, concrete benchmark numbers on 81 programs, and policy-flexibility demonstration are strengths that would make the result useful for RISC-V security research and deployment.

major comments (2)
  1. [Evaluation] The central security claim—that Bloom-filter membership queries enforce precise source authorization—rests on the assumption that false-positive rates remain negligible as equivalence-class sizes grow. However, no section supplies the concrete Bloom-filter parameters (m, k), the maximum observed class size under the CFG-derived policy, or the resulting theoretical false-positive probability (1-(1-1/m)^{kn})^k. The evaluation only states that the 81 runs completed without enforcement failures; this does not bound the worst-case bypass risk.
  2. [Evaluation] The reported overheads (0.210% and 0.421%) and the claim of “fixed-probe latency independent of the number of authorized sources” are conditioned on a 3-cycle brl latency model. No sensitivity analysis or justification for this model appears, and the performance numbers would change materially under different hardware assumptions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major evaluation concern point-by-point below and will revise the paper accordingly to strengthen the presentation of security analysis and performance modeling.

Point-by-point responses
  1. Referee: [Evaluation] The central security claim—that Bloom-filter membership queries enforce precise source authorization—rests on the assumption that false-positive rates remain negligible as equivalence-class sizes grow. However, no section supplies the concrete Bloom-filter parameters (m, k), the maximum observed class size under the CFG-derived policy, or the resulting theoretical false-positive probability (1-(1-1/m)^{kn})^k. The evaluation only states that the 81 runs completed without enforcement failures; this does not bound the worst-case bypass risk.

    Authors: We agree that explicit Bloom-filter parameters and a theoretical false-positive bound would strengthen the security argument. The current manuscript emphasizes the empirical result of zero enforcement failures across all 81 BEEBS benchmarks under both policies, but we acknowledge this does not formally quantify worst-case risk as equivalence classes scale. In the revised version we will add a new subsection that states the concrete values of m and k used in the LLVM implementation, reports the maximum observed equivalence-class size for the CFG-derived policy, and computes the corresponding false-positive probability via the standard formula. This addition will directly bound the bypass probability while retaining the existing empirical validation. revision: yes

  2. Referee: [Evaluation] The reported overheads (0.210% and 0.421%) and the claim of “fixed-probe latency independent of the number of authorized sources” are conditioned on a 3-cycle brl latency model. No sensitivity analysis or justification for this model appears, and the performance numbers would change materially under different hardware assumptions.

    Authors: The 3-cycle brl latency model reflects a conservative estimate of a parallel Bloom-filter probe unit (small k, constant-time hash lookups) integrated into a typical in-order RISC-V pipeline. We recognize that the absolute overhead figures are model-dependent and that a sensitivity study would improve clarity. In the revision we will insert a short justification paragraph for the chosen latency together with a sensitivity table or plot showing how runtime overhead scales for brl latencies of 1–5 cycles. The core claim of fixed probe latency independent of source count remains unchanged, as it follows from the Bloom-filter design rather than the specific cycle count. revision: yes
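A latency sensitivity study of the kind discussed in the second response can be sketched with a simple weighted cycle model. The per-class cycle costs below (1 for ALU and bld, 2 for taken branches, 3 for load/store, 10 for ecall) and the brl latency points (3, 5, 10) are taken from the Figure 2 excerpt on this page; the instruction mix and the counts of inserted bld/brl instructions are invented for illustration and do not come from the paper.

```python
# Cycle costs per instruction class, as quoted in the Figure 2 excerpt.
BASE_COST = {"alu": 1, "branch_taken": 2, "load_store": 3, "ecall": 10}
BLD_COST = 1  # immediate-to-CSR write


def overhead_percent(mix: dict, n_bld: int, n_brl: int, brl_latency: int) -> float:
    """Extra cycles from executed bld/brl, relative to the uninstrumented run."""
    base_cycles = sum(BASE_COST[kind] * count for kind, count in mix.items())
    added_cycles = n_bld * BLD_COST + n_brl * brl_latency
    return 100.0 * added_cycles / base_cycles


# Hypothetical dynamic instruction mix for one benchmark run.
mix = {"alu": 600_000, "branch_taken": 150_000, "load_store": 250_000, "ecall": 10}

# Hypothetical counts: 500 executed indirect transfers, each with one bld and one brl.
for latency in (3, 5, 10):
    pct = overhead_percent(mix, n_bld=500, n_brl=500, brl_latency=latency)
    print(f"brl latency {latency:2d} cycles -> overhead {pct:.3f}%")
```

In this model the overhead grows linearly with the assumed brl latency and with the dynamic frequency of indirect transfers, so the reported percentages are best read together with the latency assumption that produced them.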

Circularity Check

0 steps flagged

No circularity: core results are empirical measurements from implementation and benchmarks, not reductions to fitted inputs or self-referential definitions.

Full rationale

The paper's load-bearing claims (0.210%/0.421% overheads, 0.46%/0.52% code growth, 32.5% equivalence-class reduction) are obtained directly from LLVM RISC-V backend implementation and execution on 81 BEEBS benchmarks under an explicit 3-cycle brl latency model. No equations, derivations, or predictions are presented that reduce by construction to fitted parameters, self-citations, or ansatzes. The Bloom-filter membership check and SID propagation via bld/brl are described as a concrete mechanism with configurable section granularity; security properties are asserted via observed lack of enforcement failures in the evaluated runs rather than any closed-form derivation that loops back to the inputs. The noted assumption on false-positive rates is a completeness issue, not a circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The design rests on standard Bloom-filter false-positive bounds and the assumption that section identifiers can be assigned without excessive aliasing; no new physical constants or fitted parameters are introduced in the abstract.
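For reference, the standard Bloom filter bounds mentioned above can be written out as follows, with m the filter size in bits, k the number of probes, and n the number of authorized sources at a landing site. These are textbook relations that assume independent, uniform hashes; the page does not say which m and k the paper chooses, so no concrete values are substituted here.

```latex
\begin{align}
  p(m,k,n) &= \left(1 - \left(1 - \tfrac{1}{m}\right)^{kn}\right)^{k}
            \approx \left(1 - e^{-kn/m}\right)^{k}
            && \text{false-positive probability} \\
  k^{*} &= \frac{m}{n}\,\ln 2
            && \text{probe count minimizing } p \text{ for fixed } m, n \\
  p^{*} &\approx 2^{-k^{*}}
            && \text{false-positive probability at } k = k^{*} \\
  m &\approx -\,\frac{n \ln p}{(\ln 2)^{2}}
            && \text{bits needed for target } p \text{ at } k = k^{*}
\end{align}
```

Because hardware fixes m and k per configuration, the first relation is the one that bounds the bypass probability as n grows.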

axioms (1)
  • [domain assumption] Bloom filters can be sized and hashed to keep false-positive probability low for the expected number of authorized sources at each landing site.
    Invoked to guarantee both security and constant-time checks.
invented entities (2)
  • BRState register [no independent evidence]
    purpose: Holds the source Section Identifier (SID) for propagation to landing sites.
    New architectural state required by the bld/brl mechanism.
  • bld and brl instructions [no independent evidence]
    purpose: Load SID into BRState and perform Bloom-filter check at landing sites.
    New ISA extensions that realize the authorization primitive.

Reference graph

Works this paper leans on

35 extracted references · 3 canonical work pages

  [1] Abadi, M., Budiu, M., Erlingsson, U., Ligatti, J.: Control-flow integrity: Principles, implementations, and applications. ACM Transactions on Information and System Security (TISSEC) 13(1), 1–40 (2009)
  [2] Arm Ltd.: Arm® Architecture Reference Manual for A-profile Architecture (2020), includes Branch Target Identification (BTI) extension introduced in Armv8.5-A
  [3] Bletsch, T., Jiang, X., Freeh, V.W., Liang, Z.: Jump-oriented programming: A new class of code-reuse attack. In: Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security (ASIACCS). pp. 30–40. ACM (2011)
  [4] Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. Communications of the ACM 13(7), 422–426 (1970)
  [5] Burow, N., Carr, S.A., Nash, J., Larsen, P., Franz, M., Brunthaler, S., Payer, M.: Control-flow integrity: Precision, security, and performance. ACM Computing Surveys 50(1), 1–33 (2017)
  [6] Burow, N., Zhang, X., Payer, M.: SoK: Shining light on shadow stacks. In: Proceedings of the 40th IEEE Symposium on Security and Privacy (S&P). pp. 985–999. IEEE (2019)
  [7] Carlini, N., Barresi, A., Payer, M., Wagner, D., Gross, T.R.: Control-flow bending: On the effectiveness of control-flow integrity. In: Proceedings of the 24th USENIX Security Symposium (USENIX Security). pp. 161–176. USENIX Association (2015)
  [8] Carter, J.L., Wegman, M.N.: Universal classes of hash functions. In: Proceedings of the ninth annual ACM symposium on Theory of computing. pp. 106–112 (1977)
  [9] Checkoway, S., Davi, L., Dmitrienko, A., Sadeghi, A.R., Shacham, H., Winandy, M.: Return-oriented programming without returns. In: Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS). pp. 559–
  [10] Christoulakis, N., Christou, G., Athanasopoulos, E., Ioannidis, S.: HCFI: Hardware-enforced control-flow integrity. In: Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy. pp. 38–49 (2016)
  [11] Fuguet, C.: HPDcache: Open-source high-performance L1 data cache for RISC-V cores. In: Proceedings of the 20th ACM International Conference on Computing Frontiers. pp. 377–378 (2023)
  [12] Gaidis, A.J., Moreira, J., Sun, K., Milburn, A., Atlidakis, V., Kemerlis, V.P.: FineIBT: Fine-grain control-flow enforcement with indirect branch tracking. In: Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses. pp. 527–546 (2023)
  [13] Harwayne-Gidansky, J., Stefan, D., Dalal, I.: FPGA-based SoC for real-time network intrusion detection using counting Bloom filters. In: IEEE Southeastcon 2009. pp. 452–458. IEEE (2009)
  [14] Hu, H., Shinde, S., Adrian, S., Chua, Z.L., Saxena, P., Liang, Z.: Data-oriented programming: On the expressiveness of non-control data attacks. In: Proceedings of the 37th IEEE Symposium on Security and Privacy (S&P). pp. 969–986. IEEE (2016)
  [15] Intel Corporation: Intel® control-flow enforcement technology. Tech. rep., Intel Corporation (2019), Intel Architecture Instruction Set Extensions Programming Reference
  [16] Kasten, F., Zieris, P., Horsch, J.: Integrating static analyses for high-precision control-flow integrity. In: Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses (RAID). ACM (2024). https://doi.org/10.1145/3678890.3678920
  [17] Kirsch, A., Mitzenmacher, M.: Less hashing, same performance: Building a better Bloom filter. Random Structures & Algorithms 33(2), 187–218 (2008)
  [18] LLVM Project: Clang control flow integrity. https://clang.llvm.org/docs/ControlFlowIntegrity.html (2023)
  [19] Manoni, S., Parisi, E., Tedeschi, R., Rossi, D., Acquaviva, A., Bartolini, A.: CVA6-CFI: A first glance at RISC-V control-flow integrity extensions. arXiv preprint arXiv:2602.04991 (2026)
  [20] Microsoft Corporation: Data execution prevention. Windows XP SP2 Security Technologies (2003), also known as W^X (write XOR execute)
  [21] Niu, B., Tan, G.: Modular control-flow integrity. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). pp. 577–587. ACM (2014)
  [22] Pallister, J., Hollis, S., Bennett, J.: BEEBS: Open benchmarks for energy measurements on embedded platforms. arXiv preprint arXiv:1308.5174 (2013)
  [23] Park, S., Kang, D., Kang, J., Kwon, D.: Bratter: An instruction set extension for forward control-flow integrity in RISC-V. Sensors 22(4), 1392 (2022)
  [24] PaX Team: PaX address space layout randomization (ASLR). https://pax.grsecurity.net/docs/aslr.txt (2003)
  [25] RISC-V International: Control-flow integrity (CFI) extension for RISC-V. Tech. rep., RISC-V International (2023), draft specification
  [26] Roemer, R., Buchanan, E., Shacham, H., Savage, S.: Return-oriented programming: Systems, languages, and applications. ACM Transactions on Information and System Security (TISSEC) 15(1), 1–34 (2012)
  [27] Sateesan, A., Vliegen, J., Daemen, J., Mentens, N.: Hardware-oriented optimization of Bloom filter algorithms and architectures for ultra-high-speed lookups in network applications. Microprocessors and Microsystems 93, 104619 (2022)
  [28] Schuster, F., Tendyck, T., Liebchen, C., Davi, L., Sadeghi, A.R., Holz, T.: Counterfeit object-oriented programming: On the difficulty of preventing code reuse attacks in C++ applications. In: Proceedings of the 36th IEEE Symposium on Security and Privacy (S&P). pp. 745–762. IEEE (2015)
  [29] Shacham, H., Page, M., Pfaff, B., Goh, E.J., Modadugu, N., Boneh, D.: On the effectiveness of address-space randomization. In: Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS). pp. 298–307. ACM (2004)
  [30] Shanbhogue, V., Gupta, D., Sahita, R.: Security analysis of processor instruction set architecture for enforcing control-flow integrity. In: Proceedings of the 8th International Workshop on Hardware and Architectural Support for Security and Privacy (HASP). pp. 1–8. ACM (2019)
  [31] Tice, C., Roeder, T., Collingbourne, P., Checkoway, S., Erlingsson, U., Lozano, L., Pike, G.: Enforcing forward-edge control-flow integrity in GCC and LLVM. In: Proceedings of the 23rd USENIX Security Symposium (USENIX Security). pp. 941–955. USENIX Association (2014)
  [32] Waterman, A., Asanović, K.: The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA. RISC-V Foundation (2019), document Version 20191213
  [33] Xiang, H., Cheng, Z., Li, J., Ma, J., Lu, K.: Boosting practical control-flow integrity with complete field sensitivity and origin awareness. In: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. pp. 4524–4538 (2024)
  [34] Zaruba, F., Benini, L.: The cost of application-class processing: Energy and performance analysis of a Linux-ready 1.7-GHz 64-bit RISC-V core in 22-nm FDSOI technology. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27(11), 2629–2640 (2019)
  [35] Zhang, C., Wei, T., Chen, Z., Duan, L., Szekeres, L., McCamant, S., Song, D., Zou, W.: Practical control flow integrity and randomization for binary executables. In: Proceedings of the 34th IEEE Symposium on Security and Privacy (S&P). pp. 559–573. IEEE (2013)