arxiv: 2606.26177 · v1 · pith:PH3KJ3JInew · submitted 2026-06-24 · 💻 cs.AR

CVA6-RT: an Open-Source Time-Predictable RV64 Processor for Mixed-Criticality Systems

Pith reviewed 2026-06-26 01:00 UTC · model grok-4.3

classification 💻 cs.AR
keywords CVA6-RT · RISC-V processor · real-time extensions · mixed-criticality systems · interrupt latency · TLB partitioning · scratchpad cache · time predictability

The pith

CVA6-RT adds TLB locks, scratchpad caches and hardware context stacking to the CVA6 core for 12-cycle interrupt latency in real-time use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CVA6-RT, a real-time micro-architectural extension of the open-source CVA6 RISC-V processor aimed at mixed-criticality systems. It adds TLB partitioning and locking to make address-translation timing predictable, a mode that reconfigures L1 cache resources as scratchpad memory for deterministic memory access latency, and an enhanced interrupt controller with hardware context stacking. These changes target bounded worst-case latencies and lower timing variability so that a full 64-bit core can meet real-time requirements. The central reported result is an interrupt latency of 12 cycles with the real-time features enabled, comparable to that of simpler Arm Cortex-M microcontrollers and 10x lower than the baseline CVA6 core.

Core claim

CVA6-RT implements the rv64gch ISA and features advanced support for real-time execution, including TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and low-latency interrupt handling via an enhanced interrupt controller combined with hardware-assisted context stacking. With real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles, comparable to that of simpler Arm Cortex-M microcontrollers, and 10x lower than the baseline CVA6 core.

What carries the argument

The set of micro-architectural extensions consisting of TLB partitioning and locking, dynamically reconfigurable scratchpad L1 caches, and an enhanced interrupt controller with hardware-assisted context stacking.
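
To make the locking idea concrete, here is a minimal behavioral sketch in Python. It is not the CVA6-RT design: the fully associative organization, the LRU policy, and the names `LockableTLB`, `lookup`, and `insert` are all assumptions made for illustration, and partitioning between tasks is not modeled. It only shows why a locked translation cannot be evicted by other traffic.

```python
from collections import OrderedDict


class LockableTLB:
    """Toy fully associative TLB with LRU replacement and lockable entries.

    Hypothetical model for illustration; the real CVA6-RT TLB may differ.
    """

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = OrderedDict()  # vpn -> ppn, least recently used first
        self.locked = set()           # vpns that must never be evicted

    def lookup(self, vpn):
        """Return the cached ppn on a hit, or None on a miss."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, ppn, lock=False):
        """Install a translation, evicting the LRU unlocked entry if full."""
        if vpn not in self.entries and len(self.entries) >= self.num_entries:
            victim = next((v for v in self.entries if v not in self.locked), None)
            if victim is None:
                raise RuntimeError("all TLB entries are locked")
            del self.entries[victim]
        self.entries[vpn] = ppn
        self.entries.move_to_end(vpn)
        if lock:
            self.locked.add(vpn)


tlb = LockableTLB(num_entries=2)
tlb.insert(0x10, 0xA0, lock=True)  # page of a critical task (made-up numbers)
tlb.insert(0x20, 0xB0)
tlb.insert(0x30, 0xC0)             # evicts 0x20, never the locked 0x10
```

In this sketch the locked entry always hits, so its translation time does not depend on what other tasks touch.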

If this is right

  • Worst-case execution times for critical tasks become bounded even with complex memory systems active.
  • Interrupt response on a 64-bit open-source core reaches speeds typical of simpler microcontrollers.
  • The processor supports mode switching between high-performance and deterministic operation.
  • Open-source RISC-V cores gain practical use in systems that previously required proprietary real-time hardware.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The scratchpad reconfiguration might allow runtime allocation of deterministic memory regions to the most critical tasks.
  • Comparable extensions could be explored on other open RISC-V cores to close the predictability gap with commercial parts.
  • Full-system tests with actual mixed-criticality applications would be needed to confirm the reported bounds hold end-to-end.

Load-bearing premise

The added hardware features will bound worst-case execution latency and reduce timing variability in actual mixed-criticality workloads without introducing new sources of unpredictability or unacceptable overhead.

What would settle it

Running the processor with real-time features enabled on representative mixed-criticality benchmarks and measuring an interrupt latency above 12 cycles or higher timing variability than the baseline.
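
The test described above can be sketched as a simple check over measured samples. The function name `claim_survives`, the use of population standard deviation as the variability metric, and the sample values are all assumptions for illustration; only the 12-cycle bound comes from the paper.

```python
import statistics

LATENCY_BOUND_CYCLES = 12  # interrupt latency reported in the paper


def claim_survives(rt_samples, baseline_samples, bound=LATENCY_BOUND_CYCLES):
    """Return True if no RT sample exceeds the bound and RT jitter
    (population standard deviation) is not above the baseline jitter."""
    worst_case = max(rt_samples)
    rt_jitter = statistics.pstdev(rt_samples)
    base_jitter = statistics.pstdev(baseline_samples)
    return worst_case <= bound and rt_jitter <= base_jitter


# Made-up latency samples in cycles, not measurements from the paper.
rt = [12, 12, 12, 12]
baseline = [110, 120, 135, 118]
```

A single sample above the bound is enough to fail the check, which mirrors how a worst-case claim is refuted by one counterexample rather than by an average.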

Figures

Figures reproduced from arXiv: 2606.26177 by Alessandro Ottaviano, Angelo Garofalo, Christopher Reinwardt, Enrico Zelioli, Luca Benini, Nils Wistoff, Robert Balas.

Figure 1
Figure 1: CVA6-RT block diagram with enhanced modules highlighted.
  • Runtime-configurable L1 instruction and data cache resources in scratchpad mode for deterministic memory access latency;
  • Deterministic low-latency interrupt handling via an enhanced RISC-V CLIC and hardware-assisted register stacking for fast context switch.
Using interrupt latency as a representative use case, we show that CVA6-RT achieves 12 c…
Figure 2
Figure 2: Average interrupt latency breakdown.
… a hybrid cache/scratchpad (SPM) mode in the L1 instruction and data caches. Each cache way can be dynamically configured either as a conventional cache way or as software-managed scratchpad memory. Ways assigned to SPM are removed from the cache replacement logic, and their tags and valid bits are cleared to prevent unintended cache hits. Address decoding logic in the…
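
The per-way cache/scratchpad behavior described in the Figure 2 excerpt can be sketched as a small Python model. The class and method names are hypothetical, the address decoding is omitted, and clearing metadata when a way returns to cache mode is an assumption; the excerpt only states that tags and valid bits are cleared when a way is assigned to SPM.

```python
class WayConfigurableCache:
    """Toy model: each way is either a cache way or scratchpad (SPM)."""

    def __init__(self, num_ways, num_sets):
        self.num_ways = num_ways
        self.num_sets = num_sets
        self.is_spm = [False] * num_ways
        # Per-line metadata, indexed as [way][set].
        self.valid = [[False] * num_sets for _ in range(num_ways)]
        self.tag = [[None] * num_sets for _ in range(num_ways)]

    def set_spm(self, way, enable):
        """Switch a way between cache and SPM mode and clear its metadata.

        Clearing on the SPM -> cache transition is an assumption here.
        """
        self.is_spm[way] = enable
        self.valid[way] = [False] * self.num_sets
        self.tag[way] = [None] * self.num_sets

    def replacement_candidates(self):
        """Ways the replacement logic may still choose as victims."""
        return [w for w in range(self.num_ways) if not self.is_spm[w]]

    def fill(self, way, set_idx, tag):
        """Record a cache line fill in a way that is in cache mode."""
        if self.is_spm[way]:
            raise ValueError("SPM ways are not managed by the cache")
        self.valid[way][set_idx] = True
        self.tag[way][set_idx] = tag

    def hit(self, set_idx, tag):
        """True if any cache-mode way holds a valid line with this tag."""
        return any(self.valid[w][set_idx] and self.tag[w][set_idx] == tag
                   for w in self.replacement_candidates())


cache = WayConfigurableCache(num_ways=4, num_sets=8)
cache.fill(1, 3, 0x7)
hit_before = cache.hit(3, 0x7)
cache.set_spm(1, True)          # way 1 becomes scratchpad
hit_after = cache.hit(3, 0x7)   # stale line can no longer hit
```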
Original abstract

This work presents CVA6-RT, a real-time micro-architectural extension of the CVA6 core to bound worst-case latency and reduce task's timing execution variability. CVA6-RT implements the rv64gch ISA and features advanced support for real-time execution, including TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and low-latency interrupt handling via an enhanced interrupt controller combined with hardware-assisted context stacking. With real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles, comparable to that of simpler Arm Cortex-M microcontrollers, and 10x lower than the baseline CVA6 core.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. This paper presents CVA6-RT, a real-time micro-architectural extension of the open-source CVA6 RV64 core implementing the rv64gch ISA. It adds TLB partitioning and locking for predictable address translation, a dynamically reconfigurable scratchpad mode in the L1 caches for deterministic memory access, and an enhanced interrupt controller with hardware-assisted context stacking. The central quantitative claim is that, with real-time features enabled, CVA6-RT achieves an interrupt latency of 12 cycles—comparable to simpler Arm Cortex-M microcontrollers and 10x lower than the baseline CVA6 core—for use in mixed-criticality systems.

Significance. If the performance claims are substantiated, the work would be significant for delivering an open-source, time-predictable RISC-V processor suitable for safety-critical and mixed-criticality applications. The specific 10x reduction in interrupt latency and the combination of features for bounding worst-case latency represent a practical contribution to the field of predictable hardware.

major comments (2)
  1. [Abstract] The claim of a 12-cycle interrupt latency is presented without any measurement methodology, benchmarks, error analysis, or verification approach. This prevents assessment of whether the data support the central performance claim.
  2. [Micro-architectural extensions] The descriptions of TLB partitioning/locking, dynamically reconfigurable scratchpad L1 caches, and the enhanced interrupt controller do not include analysis demonstrating that these extensions bound worst-case execution latency and reduce timing variability without introducing new sources of unpredictability or unacceptable overhead in mixed-criticality workloads.
minor comments (2)
  1. The abstract and introduction would benefit from explicit cross-references to the sections containing the evaluation methodology and results that support the 12-cycle latency figure.
  2. Consider clarifying the exact configuration of the baseline CVA6 used for the 10x comparison (e.g., cache sizes, pipeline details) to allow direct reproduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline the planned revisions.

  1. Referee: [Abstract] The claim of a 12-cycle interrupt latency is presented without any measurement methodology, benchmarks, error analysis, or verification approach. This prevents assessment of whether the data support the central performance claim.

    Authors: We agree that the abstract would benefit from additional context. The full manuscript details the measurement approach in the evaluation section, using cycle-accurate RTL simulation, targeted interrupt-injection benchmarks, and trace-based verification. We will revise the abstract to concisely reference the simulation environment and benchmark methodology supporting the 12-cycle figure. revision: yes

  2. Referee: [Micro-architectural extensions] The descriptions of TLB partitioning/locking, dynamically reconfigurable scratchpad L1 caches, and the enhanced interrupt controller do not include analysis demonstrating that these extensions bound worst-case execution latency and reduce timing variability without introducing new sources of unpredictability or unacceptable overhead in mixed-criticality workloads.

    Authors: The manuscript includes design rationale and initial quantitative results on reduced variability. We acknowledge, however, that a more explicit analysis of worst-case latency bounding, potential new unpredictability sources, and overhead under mixed-criticality workloads is needed. We will add a dedicated subsection with this analysis and supporting experiments in the revised version. revision: yes
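
The trace-based measurement mentioned in the first response could look roughly like the following Python sketch. The trace format (a list of `(cycle, event)` pairs) and the event names `irq_raise` and `handler_entry` are hypothetical, as are the sample values; the paper's actual simulation environment is not described on this page.

```python
def interrupt_latencies(trace):
    """Pair each 'irq_raise' with the next 'handler_entry' and return
    the cycle difference for every completed pair."""
    latencies = []
    pending = None  # cycle of the interrupt currently awaiting its handler
    for cycle, event in trace:
        if event == "irq_raise" and pending is None:
            pending = cycle
        elif event == "handler_entry" and pending is not None:
            latencies.append(cycle - pending)
            pending = None
    return latencies


# Made-up trace for illustration, not data from the paper.
trace = [
    (100, "irq_raise"),
    (112, "handler_entry"),
    (400, "irq_raise"),
    (412, "handler_entry"),
]
lat = interrupt_latencies(trace)
```

Reporting the maximum of such a list alongside its mean would address the referee's request for both a worst-case figure and a view of variability.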

Circularity Check

0 steps flagged

No significant circularity detected

The document is a hardware design description of micro-architectural extensions to CVA6. It presents implementation features (TLB partitioning, scratchpad caches, interrupt controller) and states measured results such as 12-cycle interrupt latency. No equations, parameter fitting, predictions derived from inputs, or self-citation chains appear in the provided abstract or described content. The central claims rest on engineering implementation and benchmarking rather than any derivation that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model, free parameters, axioms, or invented entities are described; the contribution is a hardware microarchitecture extension.

pith-pipeline@v0.9.1-grok · 5680 in / 1103 out tokens · 40058 ms · 2026-06-26T01:00:36.328775+00:00 · methodology

Reference graph

Works this paper leans on

4 extracted references

  1. [1] G. Gracioli et al. “A Survey on Cache Management Mechanisms for Real-Time Embedded Systems”. In: ACM Comput. Surv. 48.2 (2015).

  2. [2] STMicroelectronics. Stellar Automotive Microcontrollers. 2023.

  3. [3] ARM Limited. A Beginner’s Guide on Interrupt Latency of the Arm Cortex-M processors. 2016.

  4. [4] F. Zaruba and L. Benini. “The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology”. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27.11 (2019).

RISC-V Summit Europe, Bologna, 8-12th June 2026