
arxiv: 2604.21124 · v1 · submitted 2026-04-22 · 💻 cs.AR · cs.DC

Enabling Mixed criticality applications for the Versal AI-Engines

Pith reviewed 2026-05-09 22:25 UTC · model grok-4.3

classification 💻 cs.AR cs.DC
keywords mixed-criticality systems · AI Engine · dynamic dispatching · Versal SoC · real-time systems · task switching · autonomous driving

The pith

Dynamic task dispatching infrastructure enables mixed-criticality use of the Versal AI Engine array.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a dynamic task dispatching infrastructure for the Artificial Intelligence Engine (AIE) in Versal SoCs to support mixed-criticality systems. Static dataflow mappings had previously prevented dynamic assignment of tasks with different criticality levels to the AIE's parallel tiles. The new infrastructure allows runtime task switching and assigns tasks to a shared tile pool based on the current criticality mode. A timing analysis establishes that control logic, context switching, and data copies introduce low-variance overhead that is negligible relative to overall execution time. Evaluation on an autonomous-driving workload with variable execution times shows maximized utilization, 65.5 percent less idle time, under 0.002 percent added execution time, and doubled throughput for low-criticality tasks.

Core claim

The authors present a runtime task dispatcher for the AIE array that permits dynamic assignment of mixed-criticality tasks to tiles. This removes the restriction of fixed dataflow mappings and lets the system reallocate resources when the criticality mode changes. A detailed analysis of control, switching, and data movement costs shows they stay small and stable relative to task times. In a concrete autonomous-driving example, the scheme raises tile utilization, cuts idle periods by 65.5 percent, adds less than 0.002 percent to execution time, and doubles the rate at which low-criticality tasks complete.
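
As a rough illustration of what "assigning tasks to a shared tile pool based on the criticality mode" could look like, here is a minimal Python sketch. It is not the authors' dispatcher: the policy (high-criticality tasks are served first, low-criticality tasks receive the leftover tiles), the `Task` type, the `assign_tiles` function, and the task names and tile counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

HIGH, LOW = "high", "low"  # two levels assumed purely for illustration


@dataclass
class Task:
    name: str
    criticality: str        # HIGH or LOW
    demand: Dict[str, int]  # hypothetical: tiles requested in each system mode


def assign_tiles(tasks: List[Task], pool_size: int, mode: str) -> Dict[str, int]:
    """Grant tiles from one shared pool, serving high-criticality tasks first."""
    free = pool_size
    grant: Dict[str, int] = {}
    # Stable sort: high-criticality tasks first, original order otherwise.
    for task in sorted(tasks, key=lambda t: t.criticality != HIGH):
        tiles = min(task.demand[mode], free)
        grant[task.name] = tiles
        free -= tiles
    return grant


# Hypothetical workload: the high-criticality task needs more tiles in HIGH mode.
tasks = [
    Task("lo_task", LOW, {LOW: 4, HIGH: 4}),
    Task("hi_task", HIGH, {LOW: 2, HIGH: 4}),
]
print(assign_tiles(tasks, pool_size=6, mode=HIGH))  # {'hi_task': 4, 'lo_task': 2}
print(assign_tiles(tasks, pool_size=6, mode=LOW))   # {'hi_task': 2, 'lo_task': 4}
```

In this toy setup the low-criticality task gets more tiles once the high-criticality demand drops, which is the qualitative behaviour the summary describes; the numbers themselves carry no meaning.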

What carries the argument

Dynamic task dispatching infrastructure that performs runtime context switching and data copying to reassign AIE tiles to tasks of varying criticality.

If this is right

  • Mixed-criticality designs can now exploit the full parallel capacity of the AIE array instead of reserving tiles statically.
  • The measured overhead bounds allow the dispatcher to be used without violating hard real-time deadlines for high-criticality tasks.
  • Low-criticality tasks benefit from higher throughput when high-criticality load decreases.
  • Overall AIE utilization rises, reducing the hardware resources needed for a given workload set.
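
The utilization argument in the list above can be sketched with a toy comparison between a static partition and a shared pool. The demand traces, tile counts, and the function names `idle_static` and `idle_pooled` are invented for illustration; the sketch does not model the paper's workload and is not meant to reproduce its 65.5 percent figure.

```python
from typing import List


def idle_static(high_demand: List[int], low_demand: List[int],
                high_tiles: int, low_tiles: int) -> int:
    """Idle tile-slots when each criticality level owns a fixed partition."""
    idle = 0
    for h, l in zip(high_demand, low_demand):
        idle += high_tiles - min(h, high_tiles)
        idle += low_tiles - min(l, low_tiles)
    return idle


def idle_pooled(high_demand: List[int], low_demand: List[int],
                pool_tiles: int) -> int:
    """Idle tile-slots when both levels draw from one pool, high level first."""
    idle = 0
    for h, l in zip(high_demand, low_demand):
        used_high = min(h, pool_tiles)
        used_low = min(l, pool_tiles - used_high)
        idle += pool_tiles - used_high - used_low
    return idle


# Hypothetical per-slot tile demand over four time slots.
high = [4, 1, 1, 4]
low = [4, 4, 4, 4]
print(idle_static(high, low, high_tiles=4, low_tiles=2))  # 6
print(idle_pooled(high, low, pool_tiles=6))               # 2
```

With a static split, tiles reserved for the high-criticality level sit idle whenever its demand dips; with a pool, the low-criticality level absorbs them.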

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar dynamic mechanisms could be applied to other heterogeneous accelerators in safety-critical domains.
  • The dispatcher might be extended to support more than two criticality levels or finer-grained resource allocation.
  • This work suggests that hardware vendors could add explicit support for runtime task migration in future AI engine designs.

Load-bearing premise

The dispatch overhead stays low and predictable even when the system switches criticality modes frequently or under varying data volumes.
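
One way to make this premise concrete is a first-order budget check. This is an assumed model, not a formula from the paper, and every symbol is introduced here for illustration: $C_i$ is the execution time of high-criticality task $i$ without dispatching, $k_i$ the number of dispatch events charged to one job of that task, $t_{\mathrm{ctrl}}$, $t_{\mathrm{ctx}}$ and $t_{\mathrm{copy}}(s_{\max})$ upper bounds on control logic, context switch, and copying the largest data volume $s_{\max}$, and $D_i$ the deadline.

```latex
R_i \;\le\; C_i + k_i \left( t_{\mathrm{ctrl}} + t_{\mathrm{ctx}} + t_{\mathrm{copy}}(s_{\max}) \right) \;\le\; D_i
```

Under this model the premise fails if frequent mode switches inflate $k_i$, or if the copy time at the largest data volume exceeds the bound that was measured.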

What would settle it

An experiment that records dispatcher latency distributions during repeated mode switches with maximum data-copy sizes and checks whether any high-criticality task misses its deadline.
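
The analysis side of such an experiment might look like the following sketch. It assumes dispatcher latencies and job completion times have already been recorded by some means; the function names, the microsecond unit, and the sample values are hypothetical, and nothing here reflects the authors' measurement setup.

```python
import statistics
from typing import Dict, List, Tuple


def summarize_latencies(samples_us: List[float]) -> Dict[str, float]:
    """Summary of recorded dispatcher latencies (microseconds assumed)."""
    return {
        "mean": statistics.fmean(samples_us),
        "stdev": statistics.pstdev(samples_us),  # population standard deviation
        "max": max(samples_us),                  # observed worst case, not a proven bound
    }


def deadline_misses(jobs: List[Tuple[float, float]]) -> int:
    """Count high-criticality jobs whose completion time exceeds the deadline.

    Each entry is (completion_time, deadline) in the same time unit.
    """
    return sum(1 for finish, deadline in jobs if finish > deadline)


# Hypothetical samples collected during repeated mode switches.
print(summarize_latencies([10.0, 10.0, 12.0, 12.0]))
print(deadline_misses([(9.0, 10.0), (10.5, 10.0), (10.0, 10.0)]))  # 1
```

An observed maximum only bounds what was seen in the run; it would support the claim, but it would not by itself prove a worst-case guarantee.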

Figures

Figures reproduced from arXiv: 2604.21124 by Alberto Garcia-Ortiz, Daniele Passaretti, Martin Wilhelm, Thilo Pionteck, Vincent Sprave.

Figure 1: AIE array microarchitecture [9]. Applications for the AIE are expressed as dataflow graphs, where nodes, called ’kernels’ within AMD environment, are mapped onto individual tiles. The graph topology must be fixed at compile time, with the Versal PS managing execution at runtime.
Figure 2: Infrastructure for dynamic task dispatching.
Figure 3: Task queue implementation.
Figure 4: Timing diagram to illustrate the context switch.
Figure 5: Criticality-aware resource allocation strategy flowchart.
Figure 7: Measured runtime of the criticality-aware resource…
Figure 8: Time measurements of copy time from input buffer to…
Figure 10: Dynamic dispatching infrastructure with particle filter…
Figure 11: Execution time distribution of static mapping and the…
Original abstract

Adaptive Systems-on-Chips (SoCs) are increasingly being used in mixed criticality systems (MCSs), such as in autonomous driving, aviation and medical systems. In this context, AMD has proposed the Versal SoC, which has a heterogeneous architecture including, among other components, an Artificial Intelligence Engine (AIE), which is a 2D array of processors and memory tiles designed for AI and signal processing workloads. While this AIE offers significant potential for accelerating real-time data processing tasks, this has not yet been explored in the context of MCSs since individual tasks with different criticality levels cannot be dynamically assigned to tiles due to the static mapping of dataflow graphs and tasks. In this work, we propose a dynamic task dispatching infrastructure that enables task switching on the AIE at runtime. Based on this infrastructure, we present an MCS design that dynamically assigns tasks of different criticality to a pool of AIE tiles, depending on the criticality mode of the system. Our approach overcomes the limitations of static dataflow graph mappings and, for the first time, exploits the parallel processing capabilities of the AIE for MCSs. We also present a comprehensive timing analysis of the overhead introduced by the task dispatcher infrastructure, focusing on control logic, context switching and data copy operations. This shows that these operations have low variance and are negligible compared to the overall execution time, demonstrating that our infrastructure is suitable for MCSs. Finally, we evaluate the proposed infrastructure using an autonomous driving workload with tasks that have variable execution times and different criticality levels. In this case study, we maximized AIE utilization, reducing idle time by 65.5 %, while measuring an execution time overhead of less than 0.002 %, and doubling the throughput of low-criticality tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes a dynamic task dispatching infrastructure for the AMD Versal AI-Engine (AIE) to enable mixed-criticality systems (MCS) by allowing runtime assignment of tasks with different criticality levels to AIE tiles, overcoming the limitations of static dataflow graph mappings. It presents a timing analysis of the dispatcher overhead (control logic, context switching, and data copies) claiming low variance and negligible impact, and evaluates the approach on an autonomous driving workload, reporting a 65.5% reduction in idle time, less than 0.002% execution time overhead, and doubled throughput for low-criticality tasks.

Significance. If the overhead claims hold under full MCS operating conditions, this would be a meaningful contribution to real-time embedded systems by extending the parallel capabilities of the AIE array to safety-critical applications such as autonomous driving and aviation. The concrete utilization and throughput gains in the case study provide practical evidence of the approach's potential benefits.

major comments (1)
  1. The timing analysis section claims that dispatcher operations (control logic, context switching, data copies) have low variance and are negligible compared to overall execution time, supporting suitability for MCS. However, the analysis is described as covering these components in isolation. The central claim that the infrastructure preserves hard real-time guarantees requires demonstrating that overhead remains low-variance and non-interfering during system mode switches when high-criticality tasks are active and under concurrent tile usage. The single autonomous-driving workload evaluation does not enumerate worst-case interleavings or prove bounded variance in such dynamic scenarios.
minor comments (1)
  1. The evaluation section would benefit from additional details on the specific tasks, their execution time variability, criticality levels, and the exact mechanism for dynamic assignment to enable better assessment of the reported throughput doubling.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment point by point below.

  1. Referee: The timing analysis section claims that dispatcher operations (control logic, context switching, data copies) have low variance and are negligible compared to overall execution time, supporting suitability for MCS. However, the analysis is described as covering these components in isolation. The central claim that the infrastructure preserves hard real-time guarantees requires demonstrating that overhead remains low-variance and non-interfering during system mode switches when high-criticality tasks are active and under concurrent tile usage. The single autonomous-driving workload evaluation does not enumerate worst-case interleavings or prove bounded variance in such dynamic scenarios.

    Authors: We agree that the timing analysis presents the overhead components in isolation and that the single workload, while including tasks with variable execution times and mixed criticality, does not explicitly enumerate worst-case interleavings or measure overhead during mode switches with high-criticality tasks active under concurrent tile usage. This is a valid observation. In the revised version we will add a dedicated subsection with new measurements of dispatcher overhead (control logic, context switching, and data copies) captured during explicit system mode switches, with high-criticality tasks executing and multiple tiles operating concurrently. These measurements will report the observed variance and discuss non-interference properties under those conditions, thereby strengthening the evidence that hard real-time guarantees are preserved.

    Revision: yes

Circularity Check

0 steps flagged

No circularity; engineering implementation validated by direct measurements

The paper presents a runtime task dispatcher for Versal AIE tiles in mixed-criticality settings. All quantitative claims (65.5% idle-time reduction, <0.002% overhead, doubled low-criticality throughput) are obtained from concrete workload execution on an autonomous-driving case study. No equations, fitted parameters, or derivations appear; the timing analysis is an empirical measurement of control logic, context switches, and data copies rather than a model that reduces to its own inputs. No self-citations, ansatzes, or uniqueness theorems are invoked as load-bearing steps. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.
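
For readers who want to see how the three headline figures relate to raw measurements, the following sketch computes them under assumed definitions. The paper's exact definitions are not given in this summary, so the formulas, function names, and sample inputs below are illustrative assumptions only.

```python
def idle_reduction_pct(idle_baseline: float, idle_dynamic: float) -> float:
    """Percentage drop in idle time relative to the static baseline (assumed definition)."""
    return 100.0 * (idle_baseline - idle_dynamic) / idle_baseline


def overhead_pct(dispatch_time: float, total_time: float) -> float:
    """Dispatcher time as a percentage of total execution time (assumed definition)."""
    return 100.0 * dispatch_time / total_time


def throughput_ratio(jobs_dynamic: int, jobs_baseline: int) -> float:
    """Ratio of completed low-criticality jobs over the same observation window."""
    return jobs_dynamic / jobs_baseline


# Hypothetical inputs, chosen only to exercise the formulas.
print(idle_reduction_pct(200.0, 50.0))   # 75.0
print(overhead_pct(1.0, 100_000.0))      # about 0.001
print(throughput_ratio(20, 10))          # 2.0
```

Under these definitions, an overhead below 0.002 percent means the dispatcher accounts for less than 2 time units in every 100,000 units of execution.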

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the feasibility of low-overhead runtime context switching on the AIE hardware and on the assumption that measured dispatcher times remain representative across all criticality modes.

axioms (1)
  • domain assumption: The Versal AIE hardware and its programming model support dynamic task switching with bounded and low-variance latency for control, context save/restore, and data movement.
    Invoked to justify that the dispatcher is suitable for MCS timing requirements.
invented entities (1)
  • Dynamic task dispatching infrastructure (no independent evidence)
    purpose: Enable runtime assignment of tasks with different criticality levels to a shared pool of AIE tiles based on system mode.
    New control layer proposed to overcome static dataflow mapping; no independent evidence outside the paper's measurements is provided.



Reference graph

Works this paper leans on

20 extracted references · 4 canonical work pages

  1. [1] Asynchronous Buffer Port Access
  2. [2] Zou, An; Xu, Yuankai; Ni, Yinchen; Chen, Jintao; Ma, Yehan; Li, Jing; Gill, Christopher; Zhang, Xuan; Jin, Yier. CoRR, 2025.
  3. [3] Jiang, Zhe; Zhao, Shuai; Dong, Pan; Yang, Dawei; Wei, Ran; Guan, Nan; Audsley, Neil. Re-Thinking Mixed-Criticality Architecture for Automotive Industry.
  4. [4] Hassan, Mohamed. Heterogeneous MPSoCs for Mixed-Criticality Systems: Challenges and Opportunities.
  5. [5] Designing mixed criticality applications on modern heterogeneous MPSoC platforms. 31st Euromicro Conference on Real-Time Systems (ECRTS 2019).
  6. [6] Gupta, Aman; Ahmed, Sagheer; Kumar Jain, Abhishek; Arbel, Ygal; Morshed, Abbas; Schultz, David. Run-Time. 2020 13th
  7. [7] Burns, Alan; Davis, Robert I. ACM Computing Surveys, 2017. doi:10.1145/3131347
  8. [8] Zou, An and others. arXiv preprint arXiv:2505.11970.
  9. [9] Cinque, Marcello; Cotroneo, Domenico; De Simone, Luigi; Rosiello, Stefano. Future Generation Computer Systems, 2022.
  10. [10] Martins, Jos. Shedding Light on Static Partitioning Hypervisors for. Proceedings of the 29th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2023. doi:10.1109/RTAS58335.2023.00011
  11. [12] Ottaviano, Daniele; Ciraolo, Francesco; Mancuso, Renato; Cinque, Marcello. 36th Euromicro Conference on Real-Time Systems (ECRTS 2024).
  12. [13] Xia, Tian; Tian, Ye; Pr. Journal of Systems Architecture, 2019.
  13. [14] Wulf, Cornelia; G. Virtualization of Reconfigurable Mixed-Criticality Systems. 2022.
  14. [15] Vu, Duc Viet; Sander, Oliver; Sandmann, Tobias; Baehr, Stefan; Heidelberger, Johann; Becker, J. Enabling Partial Reconfiguration for Coprocessors in Mixed Criticality Multicore Systems Using. 2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14).
  15. [16] Mixed criticality systems - a history of misconceptions? IEEE Design & Test.
  16. [17] Muttillo, Vittoriano; Valente, Giacomo; Pomante, Luigi. Criticality-driven Design Space Exploration for Mixed-Criticality Heterogeneous Parallel Embedded Systems.
  17. [18] Janßen, Benedikt; Korkmaz, Fatih; Derya, Halil; Hübner, Michael; Ferreira, Mário Lopes; Ferreira, João Canas. Towards a type 0 hypervisor for dynamic reconfigurable systems.
  18. [19] Mauser, Lucas; Zimmermann, Eva; Nedvědický, Pavel; Eisenreich, Tobias; Wäschle, Moritz; Wagner, Stefan. Towards Mixed-Criticality Software Architectures for Centralized. eprint 2506.05822
  19. [20] Wulf, Cornelia; Charaf, Najdet; Göhringer, Diana. Virtualization of Reconfigurable Mixed-Criticality Systems.
  20. [21] Vestal, Steve. Preemptive Scheduling of Multi-criticality Systems with Varying Degrees of Execution Time Assurance.