Recognition: no theorem link
Hardware Utilization and Inference Performance of Edge Object Detection Under Fault Injection
Pith reviewed 2026-05-15 07:58 UTC · model grok-4.3
The pith
TensorRT-optimized YOLO models on Jetson Nano keep GPU occupancy, temperature, power, and memory stable even under heavy input degradation from injected faults.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that across both tasks and both models the inference engines keep GPU occupancy stable, temperature rise under control, and power consumption within safe limits, while memory usage settles into a consistent release pattern after the initial warm-up phase. Object detection tends to show somewhat more variability in memory and thermal behavior, yet both tasks point to the same conclusion: the TensorRT pipelines hold up well even when the input data is heavily degraded.
What carries the argument
The decoupled fault injection framework that leverages large language models and latent diffusion models to synthesize degraded inputs from JetBot platform data, combined with hardware monitoring of TensorRT-optimized YOLO models on Jetson Nano.
If this is right
- Stable GPU occupancy supports predictable throughput without sudden drops during operation on edge devices.
- Controlled temperature and power draw reduce the risk of thermal throttling or excessive battery drain in mobile autonomous systems.
- Consistent memory release patterns after warm-up enable reliable long-running inference without accumulating resource leaks.
- Object detection showing more variability than lane-following still stays within safe hardware bounds, suggesting the approach tolerates task-specific differences.
- The findings provide a hardware-level reliability view that can sit alongside performance benchmarks for edge deployment decisions.
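The hardware metrics at the center of these findings (GPU occupancy, temperature, power, memory) are typically sampled on a Jetson Nano with NVIDIA's `tegrastats` utility. A minimal parsing sketch is shown below; the field names (`GR3D_FREQ`, `GPU@…C`, `POM_5V_IN`) match the Nano-era output format but vary across JetPack versions, so they are an assumption to verify on the target board, and the sample line here is fabricated for illustration.

```python
import re

def parse_tegrastats(line):
    """Parse one tegrastats output line into a metrics dict.

    Assumes Jetson Nano-style field names (GR3D_FREQ for GPU engine
    utilization, POM_5V_IN for board power); other JetPack releases
    rename or drop some of these fields.
    """
    metrics = {}
    m = re.search(r"RAM (\d+)/(\d+)MB", line)
    if m:
        metrics["ram_used_mb"] = int(m.group(1))
        metrics["ram_total_mb"] = int(m.group(2))
    m = re.search(r"GR3D_FREQ (\d+)%", line)
    if m:
        metrics["gpu_util_pct"] = int(m.group(1))  # GR3D is the Jetson GPU engine
    m = re.search(r"GPU@([\d.]+)C", line)
    if m:
        metrics["gpu_temp_c"] = float(m.group(1))
    m = re.search(r"POM_5V_IN (\d+)/(\d+)", line)
    if m:
        metrics["power_now_mw"] = int(m.group(1))   # instantaneous draw
        metrics["power_avg_mw"] = int(m.group(2))   # running average
    return metrics

# Illustrative sample line (not taken from the paper's logs)
sample = ("RAM 1980/3956MB (lfb 117x4MB) CPU [14%@1224,11%@1224,10%@1224,16%@1224] "
          "GR3D_FREQ 76% GPU@42.5C POM_5V_IN 4059/3872")
print(parse_tegrastats(sample))
```

In a long-running campaign, one would pipe `tegrastats --interval 1000` into this parser and log the resulting dicts per inference batch.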
Where Pith is reading between the lines
- Developers of autonomous edge systems might focus more on input robustness testing than on adding hardware margins for fault tolerance.
- The monitoring approach could be extended to other edge platforms or models to create comparative reliability profiles across hardware choices.
- In deployed vehicles, pairing this kind of metric tracking with runtime input quality checks could enable graceful degradation strategies before hardware limits are reached.
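The runtime input quality check suggested above could be as simple as a brightness and sharpness gate run before each inference. A minimal sketch, assuming grayscale `uint8` frames; the threshold values (`min_brightness`, `min_sharpness`) are illustrative placeholders, not figures from the paper, and would need calibration on the target camera.

```python
import numpy as np

def input_quality_ok(frame, min_brightness=30.0, max_brightness=225.0,
                     min_sharpness=5.0):
    """Cheap pre-inference gate on a grayscale frame (uint8 HxW array).

    Thresholds are illustrative assumptions, not values from the paper.
    """
    gray = frame.astype(np.float32)
    brightness = gray.mean()
    # Sharpness proxy: variance of first differences in both axes,
    # a lightweight stand-in for the variance-of-Laplacian blur metric.
    sharpness = np.var(np.diff(gray, axis=0)) + np.var(np.diff(gray, axis=1))
    return bool((min_brightness <= brightness <= max_brightness)
                and (sharpness >= min_sharpness))

rng = np.random.default_rng(0)
textured = rng.integers(0, 256, (120, 160), dtype=np.uint8)  # in-range, high-detail frame
dark_flat = np.full((120, 160), 5, dtype=np.uint8)           # under-exposed, featureless
print(input_quality_ok(textured), input_quality_ok(dark_flat))
```

A deployed system could skip or down-weight detections on frames that fail the gate, falling back to the last trusted state rather than pushing degraded input through the pipeline.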
Load-bearing premise
The faults synthesized using LLMs and LDMs based on JetBot platform data accurately represent the kinds of real-world input degradation that would occur in deployed autonomous driving systems.
What would settle it
The claim that the pipelines hold up well would be falsified if real-world sensor data with actual degradations (such as camera noise or weather effects) produced GPU occupancy spikes, uncontrolled temperature rises, or power draws outside safe limits while the synthetic faults did not.
Original abstract
As deep learning models are deployed on resource-constrained edge platforms in autonomous driving systems, reliable knowledge of hardware behavior under resource degradation becomes an essential requirement. Therefore, we introduce a systematic characterization of CPU load, GPU utilization, RAM consumption, power draw, throughput, and thermal behaviour of TensorRT-optimized YOLOv10s, YOLOv11s and YOLO2026n pipelines running on NVIDIA Jetson Nano under a large-scale fault injection campaign targeting both lane-following and object detection tasks. Faults are synthesized using a decoupled framework that leverages large language models (LLMs) and latent diffusion models (LDMs), based on original data from our JetBot platform data collection. Results show that across both tasks and both models the inference engines keep GPU occupancy stable, temperature rise under control, and power consumption within safe limits, while memory usage settles into a consistent release pattern after the initial warm-up phase. Object detection tends to show somewhat more variability in memory and thermal behavior, yet both tasks point to the same conclusion: the TensorRT pipelines hold up well even when the input data is heavily degraded. These findings offer a hardware-level view of model reliability that sits alongside, rather than against, the broader body of work focused on inference performance at the edge.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a large-scale empirical characterization of hardware metrics (CPU load, GPU utilization, RAM, power, throughput, temperature) for TensorRT-optimized YOLOv10s, YOLOv11s, and YOLO2026n models on NVIDIA Jetson Nano. It targets lane-following and object detection tasks under faults synthesized via LLMs and LDMs conditioned on JetBot platform data. The central claim is that GPU occupancy stays stable, temperature and power remain bounded, and memory exhibits consistent release patterns after warm-up, even with heavily degraded inputs, though object detection shows more variability.
Significance. If the synthetic fault model proves representative, the work supplies useful hardware-level reliability data for edge AI in autonomous systems, complementing accuracy-focused studies. The systematic campaign across models and tasks is a positive contribution, but the absence of validation for the fault synthesis method limits its impact and generalizability.
major comments (1)
- [Fault synthesis section] The manuscript provides no quantitative validation (e.g., distributional metrics, perceptual similarity scores, or side-by-side hardware trace comparisons) of the LLM/LDM-generated faults against actual camera, lighting, or sensor degradations recorded on the same Jetson Nano/JetBot platform. This is load-bearing for the stability claims, as the observed robustness could be an artifact of the particular synthetic distribution rather than a general property of the TensorRT pipelines.
minor comments (2)
- [Abstract] Abstract contains minor typographical issues (e.g., 'reli able' and 'ob ject') that should be corrected for clarity.
- [Abstract and Results] The abstract states conclusions from a large-scale campaign but provides no quantitative results, error bars, or data tables; the full paper should include at least summary statistics or key figures in the results section to support the claims.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the value of the systematic hardware characterization across models and tasks. We address the major comment on fault synthesis validation below.
Point-by-point responses
-
Referee: [Fault synthesis section] The manuscript provides no quantitative validation (e.g., distributional metrics, perceptual similarity scores, or side-by-side hardware trace comparisons) of the LLM/LDM-generated faults against actual camera, lighting, or sensor degradations recorded on the same Jetson Nano/JetBot platform. This is load-bearing for the stability claims, as the observed robustness could be an artifact of the particular synthetic distribution rather than a general property of the TensorRT pipelines.
Authors: We agree that the lack of quantitative validation for the synthetic faults is a limitation that affects the strength of the generalizability claims. The synthesis framework conditions LLMs and LDMs on real JetBot-collected data, but our campaign did not include recording of paired real-world degraded inputs (e.g., actual camera or sensor faults) for direct comparison. In revision we will: (1) expand the fault synthesis section with distributional statistics (mean/variance of pixel intensity, edge density, and brightness) comparing original vs. synthetic inputs, plus a simple perceptual metric such as average LPIPS distance; (2) add an explicit limitations subsection stating that all robustness conclusions are conditioned on this synthetic distribution and that real-fault validation is required for broader claims. We cannot supply side-by-side hardware traces against actual recorded degradations because such paired data was never collected. revision: partial
- Remaining gap: direct quantitative validation (distributional metrics or hardware traces) against actually recorded real-world camera/lighting/sensor degradations on the JetBot platform, as no such paired real-fault recordings were made during the original data collection.
Circularity Check
No circularity: purely empirical hardware measurements
full rationale
The manuscript reports direct experimental observations of GPU occupancy, temperature, power, memory, and throughput on Jetson Nano under LLM/LDM-synthesized faults for YOLO pipelines. No equations, fitted parameters, predictions, or derivations appear in the provided text. The fault synthesis step is described as input generation from collected JetBot data rather than a self-referential loop. Central claims rest on measured stability metrics, not on any reduction to prior outputs or self-citations. This is a standard empirical characterization study with no load-bearing self-referential structure.
Reference graph
Works this paper leans on
- [1] F. Pasandideh, M. Azarafza, A. Ehteshami Bejnordi, S. Henkler, and A. Rettberg, “Decoupled generative fault injection for autonomous robots via LLM-LDM,” in Proceedings of the AI Technology Conference (AITC), 2026, poster presentation.
- [2] M. Pourreza and P. Narasimhan, “When timeouts fail: Revisiting fault detection under resource stress in edge computing,” in Proceedings of the 18th IEEE/ACM International Conference on Utility and Cloud Computing (UCC ’25), New York, NY, USA: Association for Computing Machinery, 2026. [Online]. Available: https://doi.org/10.1145/3773274.3774280
- [3] S. Kim, C. Kim, and S. Kim, “Improving performance of real-time object detection in edge device through concurrent multi-frame processing,” IEEE Access, vol. 13, pp. 1522–1533, 2025.
- [4] H. M. Aljami, N. A. Alrowais, A. M. AlAwajy, S. O. Alhrgan, R. A. Aldwaani, M. S. Alsawadi, N. U. Saqib, S. S. Alam, and R. Alsubaie, “Benchmarking YOLOv8 variants for object detection efficiency on Jetson Orin NX for edge computing applications,” Computers, vol. 15, no. 2. [Online]. Available: https://www.mdpi.com/2073-431X/15/2/74
- [6] M. Raza, M. Kazmi, H. M. Kidwai, H. R. Khan, S. A. Qazi, K. Arshad, and K. Assaleh, “An edge-deployed real-time adaptive traffic light control system using YOLO-based vehicle detection and PCE-aware density estimation,” IEEE Access, vol. 13, pp. 153586–153613, 2025.
- [7] NVIDIA Developer, “NVIDIA Jetson Nano,” https://developer.nvidia.com/embedded/jetson-nano, accessed 2 Sep. 2025.
- [8] Logitech, “Logitech C270 HD webcam, 720p video with noise reducing mic,” https://www.logitech.com/en-us/shop/p/c270-hd-webcam, accessed 2 Sep. 2025.
- [9] NVIDIA, “NVIDIA T4 tensor core GPUs for accelerating AI inference,” https://www.nvidia.com/en-us/data-center/tesla-t4/, accessed 2 Sep. 2025.
- [10] EdjeElectronics, “Train and deploy YOLO models,” https://github.com/EdjeElectronics/Train-and-Deploy-YOLO-Models, 2023, accessed 2 Sep. 2025.
- [11] J. Ansel et al., “PyTorch 2: Faster machine learning through dynamic Python bytecode transformation and graph compilation,” in Proceedings of the ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024, pp. 929–947.
- [12] A. Sharma, V. Kumar, and L. Longchamps, “Comparative performance of YOLOv8, YOLOv9, YOLOv10, YOLOv11 and Faster R-CNN models for detection of multiple weed species,” Smart Agricultural Technology, vol. 9, p. 100648, Nov 2024.
- [13] ONNX Runtime Developers, “ONNX runtime,” https://onnxruntime.ai/, 2021, accessed 2 Sep. 2025.
- [14] E. Jeong, J. Kim, and S. Ha, “TensorRT-based framework and optimization methodology for deep learning inference on Jetson boards,” ACM Transactions on Embedded Computing Systems, vol. 21, no. 5, pp. 1–26, Jan 2022.
- [15] OpenAI, “GPT-OSS-120B & GPT-OSS-20B model card,” arXiv preprint arXiv:2508.10925, Aug. 2025. [Online]. Available: https://arxiv.org/abs/2508.10925
- [16]
- [17] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, Jun. 2022, pp. 10684–10695. [Online]. Available: https://arxiv.org/abs/2112.10752
- [18] M. Azarafza and F. Pasandideh, “Visionfault-350k: A large-scale fault injection dataset for robotic vision systems,” Feb. 2026. [Online]. Available: https://doi.org/10.5281/zenodo.18695332