Lagrange Index based Scheduling for Minimizing Age of Updates from Heterogeneous Sources

Aniket Mukherjee; Chandramani Singh; Joy Kuri

arxiv: 2604.18077 · v1 · submitted 2026-04-20 · 💻 cs.NI · cs.PF

Lagrange Index based Scheduling for Minimizing Age of Updates from Heterogeneous Sources

Aniket Mukherjee , Joy Kuri , Chandramani Singh This is my paper

Pith reviewed 2026-05-10 03:54 UTC · model grok-4.3

classification 💻 cs.NI cs.PF

keywords Age of InformationSchedulingRestless Multi-armed BanditSemi-Markov Decision ProcessHeterogeneous SourcesNon-preemptive TransmissionWireless Uplink

0 comments

The pith

A Lagrange index heuristic minimizes weighted average age of updates from heterogeneous sensors under non-preemptive transmissions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models scheduling of sensor updates with varying packet sizes in a single-hop wireless uplink as a restless multi-armed bandit problem whose stages depend on the chosen action. It proposes a Lagrange index heuristic to minimize the weighted average age of information cost and shows that structural properties of the heuristic allow fast index computation when weights are present. Numerical tests indicate that the resulting policy produces lower weighted age costs than standard non-preemptive schedulers, offering a workable method for systems that must keep heterogeneous data fresh.

Core claim

Formulating the non-preemptive heterogeneous update problem as an RMAB with SMDP dynamics yields a Lagrange index heuristic whose structural properties support efficient computation and deliver consistent reductions in weighted average AoI cost relative to existing non-preemptive policies.

What carries the argument

The Lagrange index based heuristic derived from the RMAB-SMDP model, which exploits structural properties to compute indices efficiently for the weighted AoI objective.

Load-bearing premise

The structural properties of the Lagrange index heuristic hold and enable efficient computation while preserving performance gains over standard policies.

What would settle it

Run the Lagrange index policy and the best existing non-preemptive policy on identical heterogeneous source instances and measure the resulting weighted average AoI; absence of a measurable reduction would falsify the performance claim.

Figures

Figures reproduced from arXiv: 2604.18077 by Aniket Mukherjee, Chandramani Singh, Joy Kuri.

**Figure 1.** Figure 1: N sources with heterogeneous update sizes, connected to a receiver through an unreliable wireless channel. A. Related Work We now briefly discuss the related work on (a) AoI minimization and on (b) Lagrange index based heuristics for RMAB problems. 1) AoI Minimization: The concept of AoI was introduced in [3], and a comprehensive overview of its extensions and practical relevance was provided in [6]. The … view at source ↗

**Figure 2.** Figure 2: State transition from tk to tk+1 given that source i is scheduled at tk. We assume Li = 2. We can formulate the scheduling problem as a long-term average cost control problem. A stationary admissible policy is a mapping from the state space Z N + to the set of N-dimensional standard unit vectors. So, given the initial state, any policy π induces a set of actions a(k) ∈ {0, 1} N , k ≥ 1 such that X N j=1 aj… view at source ↗

**Figure 3.** Figure 3: Simulation results under varying channel reliability. [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 6.** Figure 6: Average weighted AoI versus α2. We consider N = 10 flows, with five class–1 sources having (L1, α1) = (2, 12) and five class–2 sources having (L2, α2) = (15, α2), where α2 ∈ {1, 3, 5, 7, 9}. The left panel corresponds to p = 0.7, and the right panel to p = 1. The Whittle index and Lagrange index policies achieve nearly identical performance. [8] V. Tripathi and E. Modiano, “A whittle index approach to mini… view at source ↗

**Figure 5.** Figure 5: Average cost versus the number of sources. We consider [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

read the original abstract

Modern sensing systems generate heterogeneous updates ranging from small status packets to large data objects. We study a single-hop wireless uplink network where sensors generate updates at will, each consisting of a sensor dependent number of packets. Under a strict medium-access constraint and non-preemptive (no-switching) transmissions, decision stages become action-dependent and stochastic. We formulate the problem as a restless multi-armed bandit (RMAB) with semi-Markov decision process (SMDP) dynamics and develop a Lagrange index based heuristic for minimizing weighted average AoI cost. For the weighted AoI setting, we utilize the structural properties of the heuristic to enable efficient index computation. Numerical results demonstrate consistent performance gains over existing non-preemptive scheduling policies, providing a practical solution for heterogeneous freshness-aware systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

A workable Lagrange-index heuristic for weighted AoI under non-preemptive heterogeneous packet sizes, with numerical gains but limited theoretical depth.

read the letter

This paper develops a Lagrange-index heuristic for minimizing weighted average AoI in a single-hop wireless uplink where sensors send updates of varying packet lengths under strict non-preemptive medium access. The model treats the problem as a restless multi-armed bandit with semi-Markov dynamics to capture the fact that transmission durations depend on the chosen action and are stochastic. Structural properties of the indices are then used to make computation feasible for the weighted case, and simulations show consistent improvements over prior non-preemptive policies. That combination of heterogeneity, non-preemption, and index-based tractability is the concrete advance here. It addresses a realistic constraint in IoT-style freshness systems without assuming preemption or uniform packet sizes. The numerical evidence of gains is the main support offered, and the formulation itself follows established RMAB-SMDP lines for AoI. The soft spots are the usual ones for this style of work: no optimality proof, reliance on heuristics, and thin detail in the abstract on exact baselines, parameter ranges, or error bars. The gains appear real in the reported tests but we lack enough on regimes or variance to judge robustness. Citations stay within the standard AoI and bandit literature without obvious gaps or over-claiming. This is for people building scheduling rules for mixed-size status updates in wireless sensor networks or similar constrained links. A reader already working on index policies or AoI with variable service times will see the most direct value. It deserves peer review because the problem is well-posed, the heuristic is computable, and the numerical comparison provides a starting point for further validation, even if the paper will need more on assumptions and comparisons.

Referee Report

2 major / 2 minor

Summary. The paper formulates non-preemptive scheduling of heterogeneous updates (sensor-dependent packet counts) in a single-hop wireless uplink as a restless multi-armed bandit (RMAB) with semi-Markov decision process (SMDP) dynamics, owing to action-dependent stochastic transmission stages. It proposes a Lagrange-index heuristic for minimizing weighted average AoI, exploits structural properties of the heuristic to enable efficient index computation in the weighted-AoI case, and reports numerical results showing consistent gains over existing non-preemptive policies.

Significance. If the structural properties and numerical gains hold under the stated model, the work supplies a practical, index-based heuristic for freshness-aware scheduling with heterogeneous packet sizes, extending RMAB techniques to SMDP settings where standard index policies may not apply directly.

major comments (2)

[Section on Lagrange index heuristic and structural properties] The central claim that structural properties of the Lagrange index enable efficient computation is load-bearing for the contribution, yet the manuscript provides no explicit derivation or statement of those properties (e.g., monotonicity, indexability conditions, or closed-form reduction) in the section developing the heuristic; without this, the tractability assertion cannot be verified independently of the numerical results.
[Numerical results section] Numerical results are invoked to demonstrate performance gains, but the manuscript does not report simulation parameters (channel rates, packet-size distributions, arrival processes), baseline policy implementations, or statistical error bars; this leaves the cross-policy comparison unverifiable and weakens the empirical support for the heuristic.

minor comments (2)

[Problem formulation] Clarify the precise definition of the weighted AoI cost function and how the Lagrange multiplier is chosen or updated in the index policy.
[Numerical results] Add a brief comparison table or plot legend that explicitly lists the competing non-preemptive policies and their computational complexity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below and commit to revisions that strengthen the verifiability of both the theoretical claims and the empirical results.

read point-by-point responses

Referee: [Section on Lagrange index heuristic and structural properties] The central claim that structural properties of the Lagrange index enable efficient computation is load-bearing for the contribution, yet the manuscript provides no explicit derivation or statement of those properties (e.g., monotonicity, indexability conditions, or closed-form reduction) in the section developing the heuristic; without this, the tractability assertion cannot be verified independently of the numerical results.

Authors: We agree that the absence of an explicit derivation weakens the ability to verify the tractability claim independently. In the revised manuscript we will insert a dedicated subsection (immediately following the definition of the Lagrange index) that (i) proves monotonicity of the index with respect to the weighted AoI state, (ii) establishes indexability of the SMDP under the given cost structure, and (iii) derives the closed-form reduction that replaces exhaustive search with a simple threshold computation. These steps will be presented with all intermediate lemmas and proofs, allowing readers to confirm the efficiency result without relying solely on numerical evidence. revision: yes
Referee: [Numerical results section] Numerical results are invoked to demonstrate performance gains, but the manuscript does not report simulation parameters (channel rates, packet-size distributions, arrival processes), baseline policy implementations, or statistical error bars; this leaves the cross-policy comparison unverifiable and weakens the empirical support for the heuristic.

Authors: We acknowledge that the current numerical section lacks the necessary implementation details. In the revision we will add a new subsection (or an expanded appendix) that fully specifies: (a) channel success probabilities and transmission rates, (b) the exact heterogeneous packet-size distributions used for each sensor, (c) the arrival process (including any inter-arrival statistics), (d) precise algorithmic descriptions or pseudocode for all baseline policies, and (e) performance curves accompanied by error bars obtained from at least 100 independent Monte-Carlo runs together with 95 % confidence intervals. These additions will render the reported gains fully reproducible and statistically verifiable. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper formulates the heterogeneous AoI scheduling problem as a standard RMAB-SMDP with action-dependent stages, then proposes a Lagrange-index heuristic whose structural properties are invoked only for computational tractability. No derivation step reduces by construction to a fitted parameter, self-referential prediction, or load-bearing self-citation chain. The central claim remains a heuristic with external numerical validation rather than an optimality proof or renamed input. The derivation chain is self-contained against the stated model assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests primarily on the domain assumption that the system is accurately captured by an RMAB with SMDP dynamics under non-preemptive constraints, plus the existence of exploitable structural properties for the Lagrange index.

axioms (1)

domain assumption The single-hop wireless uplink with heterogeneous packet sizes, strict medium access, and non-preemptive transmissions produces action-dependent stochastic decision stages that are well-modeled as a restless multi-armed bandit with semi-Markov dynamics.
This modeling choice is stated directly in the abstract as the basis for the Lagrange index heuristic.

pith-pipeline@v0.9.0 · 5426 in / 1378 out tokens · 80722 ms · 2026-05-10T03:54:01.840093+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

[1]

Sensor networks: an overview,

M. Tubaishat and S. Madria, “Sensor networks: an overview,”IEEE Potentials, vol. 22, no. 2, pp. 20–23, 2003

work page 2003
[2]

Minimizing age of infor- mation in vehicular networks,

S. Kaul, M. Gruteser, V . Rai, and J. Kenney, “Minimizing age of infor- mation in vehicular networks,” in2011 8th Annual IEEE Communica- tions Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, pp. 350–358, 2011

work page 2011
[3]

Real-time status: How often should one update?,

S. Kaul, R. Yates, and M. Gruteser, “Real-time status: How often should one update?,” in2012 Proceedings IEEE INFOCOM, pp. 2731–2735, 2012

work page 2012
[4]

Age of information: A new concept, metric, and tool,

A. Kosta, N. Pappas, and V . Angelakis, “Age of information: A new concept, metric, and tool,”Foundations and Trends in Networking, vol. 12, pp. 162–259, 11 2017

work page 2017
[5]

Wiswarm: Age-of-information-based wireless networking for collaborative teams of uavs,

V . Tripathi, I. Kadota, E. Tal, M. S. Rahman, A. Warren, S. Karaman, and E. Modiano, “Wiswarm: Age-of-information-based wireless networking for collaborative teams of uavs,” inIEEE INFOCOM 2023-IEEE Con- ference on Computer Communications, pp. 1–10, IEEE, 2023

work page 2023
[6]

Age of information: An introduction and survey,

R. D. Yates, Y . Sun, D. R. Brown, S. K. Kaul, E. Modiano, and S. Ulukus, “Age of information: An introduction and survey,”IEEE Journal on Selected Areas in Communications, vol. 39, no. 5, pp. 1183– 1210, 2021

work page 2021
[7]

Scheduling policies for minimizing age of information in broadcast wireless networks,

I. Kadota, A. Sinha, E. Uysal-Biyikoglu, R. Singh, and E. Modiano, “Scheduling policies for minimizing age of information in broadcast wireless networks,”IEEE/ACM Transactions on Networking, vol. 26, no. 6, pp. 2637–2650, 2018. (a)p= 0.7 (b)p= 1 Fig. 6: Average weighted AoI versusα 2. We considerN= 10 flows, with five class–1 sources having(L 1, α1) = (2,...

work page 2018
[8]

A whittle index approach to minimizing functions of age of information,

V . Tripathi and E. Modiano, “A whittle index approach to minimizing functions of age of information,”IEEE/ACM Transactions on Network- ing, vol. 32, no. 6, pp. 5144–5158, 2024

work page 2024
[9]

Computation and communication co-design for real-time monitoring and control in multi- agent systems,

V . Tripathi, L. Ballotta, L. Carlone, and E. Modiano, “Computation and communication co-design for real-time monitoring and control in multi- agent systems,” in2021 19th International Symposium on Modeling and Optimization in Mobile, Ad hoc, and Wireless Networks (WiOpt), pp. 1– 8, 2021

work page 2021
[10]

Minimum age of information in the internet of things with non-uniform status packet sizes,

B. Zhou and W. Saad, “Minimum age of information in the internet of things with non-uniform status packet sizes,”IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 1933–1947, 2020

work page 1933
[11]

Optimizing age of information in networks with large and small updates,

Z. Zhao, V . Tripathi, and I. Kadota, “Optimizing age of information in networks with large and small updates,” in2025 23rd International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp. 1–8, 2025

work page 2025
[12]

Asymptotically optimal priority policies for indexable and nonindexable restless bandits,

I. M. Verloop, “Asymptotically optimal priority policies for indexable and nonindexable restless bandits,”The Annals of Applied Probability, vol. 26, no. 4, pp. 1947–1995, 2016

work page 1947
[13]

Linear program-based policies for rest- less bandits: Necessary and sufficient conditions for (exponentially fast) asymptotic optimality,

N. Gast, B. Gaujal, and C. Yan, “Linear program-based policies for rest- less bandits: Necessary and sufficient conditions for (exponentially fast) asymptotic optimality,”Mathematics of Operations Research, vol. 49, p. 2468–2491, Nov. 2024

work page 2024
[14]

Lagrangian index policy for restless bandits with average reward,

K. Avrachenkov, V . S. Borkar, and P. Shah, “Lagrangian index policy for restless bandits with average reward,”arXiv preprint arXiv:2412.12641, 2024

work page arXiv 2024
[15]

D. P. Bertsekas,Dynamic Programming and Optimal Control, Vol. II. Athena Scientific, 3rd ed., 2007

work page 2007
[16]

S. M. Ross,Applied Probability Models with Optimization Applications. New York: Dover Publications, 1992

work page 1992
[17]

M. L. Puterman,Markov Decision Processes: Discrete Stochastic Dy- namic Programming. USA: John Wiley & Sons, Inc., 1st ed., 1994

work page 1994
[18]

L. I. Sennott,Stochastic dynamic programming and the control of queueing systems. John Wiley & Sons, 1998

work page 1998
[19]

Shaked and J

M. Shaked and J. G. Shanthikumar,Stochastic orders. Springer, 2007. APPENDIX A. Proof of Lemma 1 Proof.Step 1: Convergence of RVI. Consider the single–flow SMDP under a fixed multiplierλ. Letπ (i) denote the stationary policy that schedules flowiat every decision stage. Under this policy, the AoI process{v(k)}evolves as fol- lows: v→ ( v+ 1,with probabili...

work page 2007

[1] [1]

Sensor networks: an overview,

M. Tubaishat and S. Madria, “Sensor networks: an overview,”IEEE Potentials, vol. 22, no. 2, pp. 20–23, 2003

work page 2003

[2] [2]

Minimizing age of infor- mation in vehicular networks,

S. Kaul, M. Gruteser, V . Rai, and J. Kenney, “Minimizing age of infor- mation in vehicular networks,” in2011 8th Annual IEEE Communica- tions Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, pp. 350–358, 2011

work page 2011

[3] [3]

Real-time status: How often should one update?,

S. Kaul, R. Yates, and M. Gruteser, “Real-time status: How often should one update?,” in2012 Proceedings IEEE INFOCOM, pp. 2731–2735, 2012

work page 2012

[4] [4]

Age of information: A new concept, metric, and tool,

A. Kosta, N. Pappas, and V . Angelakis, “Age of information: A new concept, metric, and tool,”Foundations and Trends in Networking, vol. 12, pp. 162–259, 11 2017

work page 2017

[5] [5]

Wiswarm: Age-of-information-based wireless networking for collaborative teams of uavs,

V . Tripathi, I. Kadota, E. Tal, M. S. Rahman, A. Warren, S. Karaman, and E. Modiano, “Wiswarm: Age-of-information-based wireless networking for collaborative teams of uavs,” inIEEE INFOCOM 2023-IEEE Con- ference on Computer Communications, pp. 1–10, IEEE, 2023

work page 2023

[6] [6]

Age of information: An introduction and survey,

R. D. Yates, Y . Sun, D. R. Brown, S. K. Kaul, E. Modiano, and S. Ulukus, “Age of information: An introduction and survey,”IEEE Journal on Selected Areas in Communications, vol. 39, no. 5, pp. 1183– 1210, 2021

work page 2021

[7] [7]

Scheduling policies for minimizing age of information in broadcast wireless networks,

I. Kadota, A. Sinha, E. Uysal-Biyikoglu, R. Singh, and E. Modiano, “Scheduling policies for minimizing age of information in broadcast wireless networks,”IEEE/ACM Transactions on Networking, vol. 26, no. 6, pp. 2637–2650, 2018. (a)p= 0.7 (b)p= 1 Fig. 6: Average weighted AoI versusα 2. We considerN= 10 flows, with five class–1 sources having(L 1, α1) = (2,...

work page 2018

[8] [8]

A whittle index approach to minimizing functions of age of information,

V . Tripathi and E. Modiano, “A whittle index approach to minimizing functions of age of information,”IEEE/ACM Transactions on Network- ing, vol. 32, no. 6, pp. 5144–5158, 2024

work page 2024

[9] [9]

Computation and communication co-design for real-time monitoring and control in multi- agent systems,

V . Tripathi, L. Ballotta, L. Carlone, and E. Modiano, “Computation and communication co-design for real-time monitoring and control in multi- agent systems,” in2021 19th International Symposium on Modeling and Optimization in Mobile, Ad hoc, and Wireless Networks (WiOpt), pp. 1– 8, 2021

work page 2021

[10] [10]

Minimum age of information in the internet of things with non-uniform status packet sizes,

B. Zhou and W. Saad, “Minimum age of information in the internet of things with non-uniform status packet sizes,”IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 1933–1947, 2020

work page 1933

[11] [11]

Optimizing age of information in networks with large and small updates,

Z. Zhao, V . Tripathi, and I. Kadota, “Optimizing age of information in networks with large and small updates,” in2025 23rd International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp. 1–8, 2025

work page 2025

[12] [12]

Asymptotically optimal priority policies for indexable and nonindexable restless bandits,

I. M. Verloop, “Asymptotically optimal priority policies for indexable and nonindexable restless bandits,”The Annals of Applied Probability, vol. 26, no. 4, pp. 1947–1995, 2016

work page 1947

[13] [13]

Linear program-based policies for rest- less bandits: Necessary and sufficient conditions for (exponentially fast) asymptotic optimality,

N. Gast, B. Gaujal, and C. Yan, “Linear program-based policies for rest- less bandits: Necessary and sufficient conditions for (exponentially fast) asymptotic optimality,”Mathematics of Operations Research, vol. 49, p. 2468–2491, Nov. 2024

work page 2024

[14] [14]

Lagrangian index policy for restless bandits with average reward,

K. Avrachenkov, V . S. Borkar, and P. Shah, “Lagrangian index policy for restless bandits with average reward,”arXiv preprint arXiv:2412.12641, 2024

work page arXiv 2024

[15] [15]

D. P. Bertsekas,Dynamic Programming and Optimal Control, Vol. II. Athena Scientific, 3rd ed., 2007

work page 2007

[16] [16]

S. M. Ross,Applied Probability Models with Optimization Applications. New York: Dover Publications, 1992

work page 1992

[17] [17]

M. L. Puterman,Markov Decision Processes: Discrete Stochastic Dy- namic Programming. USA: John Wiley & Sons, Inc., 1st ed., 1994

work page 1994

[18] [18]

L. I. Sennott,Stochastic dynamic programming and the control of queueing systems. John Wiley & Sons, 1998

work page 1998

[19] [19]

Shaked and J

M. Shaked and J. G. Shanthikumar,Stochastic orders. Springer, 2007. APPENDIX A. Proof of Lemma 1 Proof.Step 1: Convergence of RVI. Consider the single–flow SMDP under a fixed multiplierλ. Letπ (i) denote the stationary policy that schedules flowiat every decision stage. Under this policy, the AoI process{v(k)}evolves as fol- lows: v→ ( v+ 1,with probabili...

work page 2007