arxiv: 2604.18520 · v1 · submitted 2026-04-20 · 📡 eess.SP

Joint Scheduling of Multi-Band Radar Sensing and DNN Inference for Cross-Stage Parallelism

Pith reviewed 2026-05-10 03:34 UTC · model grok-4.3

classification 📡 eess.SP
keywords multi-band radar · DNN inference · cross-stage parallelism · joint scheduling · end-to-end latency · DAG scheduling · release-aware heuristic · multi-core execution

The pith

By releasing DNN inference branches as soon as each radar band finishes sensing, joint scheduling reduces end-to-end latency versus waiting for all bands to complete.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a scheduling approach that minimizes latency in pipelines combining multi-band radar sensing with DNN inference. It moves beyond sequential stage designs by letting each inference branch start immediately after its band is sensed, rather than holding all branches until the full sensing phase ends. This interaction is captured in a joint optimization problem that allocates sensing times while respecting precedence constraints and multi-core capacity limits on the inference task graph. A release-aware heuristic paired with a greedy list scheduler solves the combinatorial problem efficiently. Simulations indicate the method lowers latency relative to a decoupled baseline across many heterogeneous sensing conditions.

Core claim

The central claim is that cross-stage parallelism, achieved by coupling sensing-time allocation with branch release times and non-preemptive multi-core DAG execution under sensing-feasibility, precedence, and core-capacity constraints, yields lower end-to-end latency than decoupled baselines when sensing requirements differ across bands.
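
The comparison behind this claim can be written compactly. The notation below is assumed for illustration and is not taken from the paper: $\tau_k$ is the sensing completion time of band $k$, $r_k$ is the release time of the inference branch fed by band $k$, and $C_{\max}(r_1,\dots,r_K)$ is the smallest achievable DAG makespan on the available cores under those release times.

```latex
% Assumed notation, not the paper's own:
%   \tau_k : sensing completion time of band k, k = 1, ..., K
%   r_k    : release time of the inference branch fed by band k
%   C_{\max}(r_1, ..., r_K) : minimum makespan under those release times
\begin{align}
  \text{decoupled baseline:}\quad & r_k = \max_{j} \tau_j \quad \forall k, \\
  \text{cross-stage release:}\quad & r_k = \tau_k \quad \forall k, \\
  & C_{\max}(\tau_1,\dots,\tau_K) \;\le\;
    C_{\max}\bigl(\max_j \tau_j,\dots,\max_j \tau_j\bigr).
\end{align}
```

The inequality holds for the same sensing completion times because any schedule that is feasible under the later, common release time remains feasible when branches are released earlier. It becomes an equality when all $\tau_k$ coincide, which is consistent with the gain depending on heterogeneous sensing requirements. The bound concerns the optimal makespan; a greedy scheduler is not guaranteed to inherit it on every instance.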

What carries the argument

A release-aware heuristic that scores each sensing decision by its effect on downstream DAG makespan, combined with greedy list scheduling for multi-core execution of released branches.
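
The greedy half of this mechanism can be illustrated with a generic list scheduler. The sketch below is not the authors' algorithm: the priority rule (earliest feasible start, ties broken by longest remaining path), the function names, and the three-task toy DAG are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def upward_rank(durations: Dict[str, float],
                succs: Dict[str, List[str]]) -> Dict[str, float]:
    """Longest path length from each task to a sink, including the task itself."""
    rank: Dict[str, float] = {}

    def visit(task: str) -> float:
        if task not in rank:
            tail = max((visit(s) for s in succs[task]), default=0.0)
            rank[task] = durations[task] + tail
        return rank[task]

    for task in durations:
        visit(task)
    return rank


def list_schedule(durations: Dict[str, float],
                  preds: Dict[str, List[str]],
                  release: Dict[str, float],
                  num_cores: int) -> Tuple[Dict[str, float], float]:
    """Non-preemptive greedy list scheduling of a DAG under release times.

    Returns (start time of every task, makespan). Hypothetical helper,
    not an API from the paper.
    """
    succs: Dict[str, List[str]] = {t: [] for t in durations}
    for task, parents in preds.items():
        for p in parents:
            succs[p].append(task)
    rank = upward_rank(durations, succs)

    core_free = [0.0] * num_cores
    start: Dict[str, float] = {}
    finish: Dict[str, float] = {}
    pending = set(durations)

    def ready(task: str) -> float:
        # A task is ready once it is released and all predecessors are done.
        return max([release.get(task, 0.0)] + [finish[p] for p in preds.get(task, [])])

    while pending:
        eligible = [t for t in sorted(pending)
                    if all(p in finish for p in preds.get(t, []))]
        core = min(range(num_cores), key=lambda c: core_free[c])
        # Earliest feasible start first; prefer the longer remaining path on ties.
        task = min(eligible, key=lambda t: (max(ready(t), core_free[core]), -rank[t]))
        start[task] = max(ready(task), core_free[core])
        finish[task] = start[task] + durations[task]
        core_free[core] = finish[task]
        pending.remove(task)
    return start, max(finish.values())


# Toy instance (arbitrary numbers): two branches feeding one fusion task.
durations = {"branch_a": 6.0, "branch_b": 2.0, "fusion": 1.0}
preds = {"fusion": ["branch_a", "branch_b"]}
sensing_done = {"branch_a": 2.0, "branch_b": 5.0}

_, joint = list_schedule(durations, preds, sensing_done, num_cores=2)
barrier = max(sensing_done.values())
_, decoupled = list_schedule(durations, preds,
                             {t: barrier for t in sensing_done}, num_cores=2)
print(joint, decoupled)  # 9.0 12.0
```

In this toy instance the long branch overlaps with the remaining sensing, so the makespan drops from 12 to 9 time units. If the sensing completion times are swapped, so that the last band to finish also feeds the longest branch, both schedules end at 12 and the gain vanishes.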

If this is right

  • Lower end-to-end latency in heterogeneous multi-band sensing scenarios
  • Clear identification of regimes where the latency gain shrinks or vanishes
  • More efficient use of cross-stage overlap between radar sensing and DNN processing
  • Improved makespan compared with stage-wise decoupled scheduling

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same release-upon-completion principle could apply to other partial-data pipelines such as multi-camera vision or multi-modal fusion
  • Hardware designs that signal sensing completion to the inference scheduler might amplify the observed gains
  • Channel fading or shared-spectrum constraints could undermine the modeled independence of band sensing times

Load-bearing premise

Sensing durations for each band can be chosen continuously and independently while the DNN task graph allows fully independent branch execution once a band completes, without interference from unmodeled hardware or channel effects.

What would settle it

An experiment on real radar hardware that measures end-to-end latency under actual band interference or fixed core contention and finds no reduction relative to the decoupled baseline.

Figures

Figures reproduced from arXiv: 2604.18520 by Kezhi Wang, Sai Xu, Yanan Du, Yansha Deng.

Figure 1: System architecture of latency-aware joint scheduling for multi-band …
Figure 2: A random DAG for the simulation.
Figure 3: Comparison of end-to-end execution timelines under the proposed joint scheduling method and the decoupled baseline.
Figure 4: The effects of three system-scaling dimensions: (a) the number of accelerator cores, (b) the bandwidth (kHz), and (c) the SINR threshold (dB).

Original abstract

This paper studies end-to-end latency minimization for a multi-band radar sensing and deep neural network (DNN) inference pipeline. Unlike conventional stage-wise designs that treat radar sensing and DNN inference as two sequential stages, the proposed framework exploits cross-stage parallelism by allowing the inference branch associated with a sensed band to start as soon as that band completes sensing, without waiting for all bands to finish. To characterize this interaction, we formulate a joint scheduling problem that couples sensing-time allocation, branch release timing, and non-preemptive multi-core execution of a directed acyclic graph (DAG) under sensing-feasibility, precedence, and core-capacity constraints. Since the resulting problem is combinatorial and strongly time-coupled, we further develop a release-aware heuristic that evaluates each sensing decision according to its downstream impact on the DAG makespan, together with a greedy list scheduler for multi-core DAG execution under release times. Simulation results show that the proposed design can effectively exploit cross-stage parallelism and reduce end-to-end latency relative to a decoupled baseline in many heterogeneous sensing scenarios, while also clarifying the operating regimes in which the latency gain becomes limited.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. This paper studies end-to-end latency minimization for a multi-band radar sensing and DNN inference pipeline. It formulates a joint scheduling problem that couples sensing-time allocation, branch release timing, and non-preemptive multi-core DAG execution under sensing-feasibility, precedence, and core-capacity constraints, allowing inference branches to start upon individual band sensing completion. A release-aware heuristic evaluates sensing decisions by downstream DAG makespan impact, paired with a greedy list scheduler. Simulations indicate the approach exploits cross-stage parallelism to reduce latency versus a decoupled baseline in heterogeneous scenarios.

Significance. If the results hold, the work could be significant for real-time integrated sensing-computing systems in applications like autonomous perception, by providing a practical heuristic for a combinatorial cross-stage scheduling problem formulated from first principles of precedence and resource limits. The absence of circularity in the objective and the focus on release-aware decisions are positive aspects, but the reliance on simulations whose setup is unspecified, with no comparison against optimal solutions and no statistical reporting, limits immediate applicability.

major comments (2)
  1. [Abstract] Abstract: the performance claims rest on 'simulation results' whose setup, number of trials, error bars, statistical tests, or comparisons to optimal solvers are not described. This is load-bearing for the central claim that the heuristic 'can effectively exploit cross-stage parallelism and reduce end-to-end latency,' as the gains cannot be assessed for robustness or significance.
  2. [Problem formulation] Problem formulation (as described in the abstract): sensing durations are treated as continuous independent decision variables per band with fully decoupled branch execution into the DAG once released. This decoupling enables the reported makespan reductions, but if practical multi-band radar constraints (discrete PRI multiples, shared front-end hardware, or channel-dependent integration times) are unmodeled, the release-aware heuristic calculations and latency gains become invalid.
minor comments (1)
  1. [Abstract] Abstract: the description of 'many heterogeneous sensing scenarios' is vague; specifying the range of band counts, sensing time distributions, or DAG structures would improve clarity without altering the claims.
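
The statistical reporting requested in major comment 1 could take a form like the following. This is a minimal sketch under assumptions: the helper name `mean_ci95` is hypothetical, the interval uses a normal approximation (reasonable for hundreds of runs; a Student-t quantile would be more appropriate for a handful), and the sample values are arbitrary placeholders rather than results from the paper.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_ci95(samples: Sequence[float]) -> Tuple[float, float, float]:
    """Sample mean with a normal-approximation 95% confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Placeholder per-trial latency reductions (decoupled minus joint), arbitrary units.
reductions = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, low, high = mean_ci95(reductions)
print(f"mean reduction {mean:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Reporting the interval on the paired difference, rather than on each method separately, shows directly whether the reduction is distinguishable from zero.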

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps improve the clarity and rigor of our work on joint scheduling for cross-stage parallelism in multi-band radar and DNN pipelines. We address each major comment below and will revise the manuscript accordingly.

  1. Referee: [Abstract] Abstract: the performance claims rest on 'simulation results' whose setup, number of trials, error bars, statistical tests, or comparisons to optimal solvers are not described. This is load-bearing for the central claim that the heuristic 'can effectively exploit cross-stage parallelism and reduce end-to-end latency,' as the gains cannot be assessed for robustness or significance.

    Authors: We agree that additional details are required to allow readers to assess robustness. In the revised manuscript, we will expand the simulation section (and abstract if space permits) to specify the full setup, including parameter ranges, number of Monte Carlo trials (typically 500–1000 independent runs), error bars or confidence intervals on latency plots, and any statistical significance tests performed. For comparisons to optima, we will add small-scale experiments using an ILP formulation solved via a commercial solver (e.g., Gurobi) on instances with few bands and tasks, reporting optimality gaps of the heuristic. These additions will be placed in a new subsection on evaluation methodology. revision: yes

  2. Referee: [Problem formulation] Problem formulation (as described in the abstract): sensing durations are treated as continuous independent decision variables per band with fully decoupled branch execution into the DAG once released. This decoupling enables the reported makespan reductions, but if practical multi-band radar constraints (discrete PRI multiples, shared front-end hardware, or channel-dependent integration times) are unmodeled, the release-aware heuristic calculations and latency gains become invalid.

    Authors: The formulation intentionally models sensing durations as continuous decision variables under the stated sensing-feasibility and precedence constraints to isolate and analyze the cross-stage parallelism mechanism. This is a deliberate abstraction that enables tractable joint optimization of release times and DAG scheduling. We acknowledge that real systems impose discrete PRI multiples, shared RF hardware, and channel-dependent integration times. In the revision we will (i) explicitly list these modeling assumptions in Section II, (ii) add a dedicated paragraph in the discussion section explaining how the framework can be extended (e.g., by discretizing the sensing-time variables or adding mutual-exclusion constraints on front-end resources), and (iii) note that the release-aware heuristic remains applicable once such constraints are incorporated. The reported gains are therefore valid within the modeled regime; we do not claim universality beyond it. revision: partial
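
The discretization mentioned in the second response can be sketched as follows. The function names and the choice of integer microseconds are assumptions for illustration. Rounding up rather than to the nearest multiple is also an assumption: it keeps a duration feasible only if sensing feasibility is monotone in dwell time.

```python
from typing import List


def round_up_to_pri(duration_us: int, pri_us: int) -> int:
    """Smallest integer multiple of the PRI that is at least duration_us."""
    if pri_us <= 0:
        raise ValueError("PRI must be positive")
    pulses = -(-duration_us // pri_us)  # ceiling division on integers
    return pulses * pri_us


def discretize(durations_us: List[int], pri_us: int) -> List[int]:
    """Snap each continuous sensing duration to a whole number of pulses."""
    return [round_up_to_pri(d, pri_us) for d in durations_us]


# Arbitrary example values: two bands, PRI of 500 microseconds.
print(discretize([2300, 5000], 500))  # [2500, 5000]
```

Because every duration can only grow, release times shift later by less than one PRI per band, which bounds how much of the continuous-model gain can be lost to this particular constraint when bands are sensed independently.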

Circularity Check

0 steps flagged

No circularity: formulation and heuristic built from first-principles constraints

The paper formulates the joint scheduling problem using standard DAG precedence, release times upon per-band sensing completion, and multi-core capacity limits, all defined directly from the problem statement without reference to fitted parameters or prior self-citations. The release-aware heuristic computes downstream makespan impact from these explicit constraints rather than reducing to an input quantity by construction. Simulation comparisons to the decoupled baseline are empirical validations under stated assumptions, with no self-definitional loops, uniqueness theorems imported from the authors, or ansatzes smuggled via citation. The latency-reduction claim is therefore an outcome of the model, not tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the scheduling model implicitly assumes continuous sensing-time allocation and standard DAG precedence without additional constraints.
