A Techno-Economic Framework for Cost Modeling and Revenue Opportunities in Open and Programmable AI-RAN

Gabriele Gemmi; Michele Polese; Tommaso Melodia

arxiv: 2603.28680 · v3 · pith:5GAY53YJnew · submitted 2026-03-30 · 💻 cs.NI

Pith reviewed 2026-05-19 16:50 UTC · model grok-4.3

classification 💻 cs.NI

keywords: AI-RAN · techno-economic analysis · GPU sharing · 5G cost modeling · AI inference revenue · idle capacity monetization · 6G economic viability · RAN acceleration

The pith

GPU-based RAN hardware can deliver up to 8x return on investment by leasing idle capacity to AI inference workloads.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a model that combines publicly available 5G Layer-1 performance numbers on x86 and GPU platforms with realistic mobile traffic patterns and LLM inference demand curves. It then weighs the higher capital and operating costs of GPU equipment against the revenue that could be earned by renting out spare cycles during low-traffic hours. The calculation shows that the extra spending is more than recovered across many combinations of token prices, demand levels, and serving densities. A reader would care because the result supplies a quantitative reason for operators to choose programmable, accelerator-rich radio equipment instead of staying with cheaper but less flexible servers.

Core claim

The additional capital and operational expenditures of GPU-heavy deployments are offset by AI-on-RAN revenue, yielding a return on investment of up to 8x across scenarios that include token depreciation, varying demand dynamics, and diverse GPU serving densities.

What carries the argument

A joint cost-and-revenue model that first quantifies surplus GPU cycles left after meeting 5G traffic demands and then prices those cycles as capacity leased to AI tenants.
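
The two-step logic described above (surplus first, pricing second) can be illustrated with a minimal Python sketch. It is not the paper's model: the function names, the linear GPUs-per-Gbps sizing rule, the hourly granularity, and every number are hypothetical placeholders chosen only to show the shape of the calculation.

```python
import math

def surplus_gpus(ran_load_gbps, gpus_per_gbps, total_gpus):
    """Per-hour GPUs left over after RAN processing is served.

    Assumption (hypothetical): RAN GPU demand scales linearly with
    traffic and is rounded up to whole GPUs.
    """
    surplus = []
    for load in ran_load_gbps:
        ran_gpus = min(math.ceil(load * gpus_per_gbps), total_gpus)
        surplus.append(total_gpus - ran_gpus)
    return surplus

def leased_revenue(surplus, llm_demand_gpus, price_per_gpu_hour):
    """Revenue from leasing min(surplus, demand) GPUs in each hour."""
    leased = (min(s, d) for s, d in zip(surplus, llm_demand_gpus))
    return sum(leased) * price_per_gpu_hour

# Illustrative inputs only, not values from the paper.
hourly_load = [2, 1, 4, 10]   # RAN traffic in Gbps for four sample hours
llm_demand = [3, 5, 3, 2]     # GPUs that AI tenants would rent per hour
idle = surplus_gpus(hourly_load, gpus_per_gbps=0.5, total_gpus=5)
revenue = leased_revenue(idle, llm_demand, price_per_gpu_hour=2.0)
print(idle, revenue)
```

In this toy run the peak hour leaves no surplus at all, which is why the leasable capacity depends on the traffic profile and not only on the installed GPU count.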

If this is right

Mobile operators gain a direct financial incentive to adopt GPU-accelerated and open RAN platforms.
The same hardware pool can serve both radio functions during peak hours and AI workloads during off-peak hours without separate capital outlays.
Revenue from AI tenants can cover the cost premium of GPU servers over conventional x86 deployments.
The economic case for future 6G rollouts improves when idle RAN capacity is treated as a sellable resource.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Operators might site GPU RAN equipment first in regions with nearby AI customers to maximize leasing revenue.
Standards that make RAN platforms more programmable would lower the friction of offering spare cycles to external AI workloads.
The framework could be extended to include other edge workloads such as video analytics or sensor processing that also need GPU cycles.
If real deployments match the model, regulators might consider incentives for shared-infrastructure deployments that improve overall compute utilization.

Load-bearing premise

Public benchmarks of 5G Layer-1 processing on different hardware platforms, together with standard traffic and AI demand models, are close enough to real operating conditions to support the projected costs and revenues.

What would settle it

A measured year-long trace from an actual GPU-equipped RAN site that records exact power draw, utilization, maintenance costs, and any revenue collected from AI tenants, then compares the realized return against the model's forecast.

Figures

Figures reproduced from arXiv: 2603.28680 by Gabriele Gemmi, Michele Polese, Tommaso Melodia.

**Figure 2.** TCO for 10 Gbps aggregate peak throughput over 10 years.

**Figure 4.** Hourly GPU allocation at deployment (w = 0) for clusters sized for Scenario 1 (top) and Scenario 2 (bottom). The total deployed capacity G_total is split at each hour between RAN processing and LLM inference. (Plot labels recovered from extraction: axes w [weeks] and GPUs; series Ĝ_RAN(w), G_tot, Ĝ_LLM^alloc(w); panels ρ_dens = 3 and ρ_dens = 12.87.)

**Figure 5.** Weekly-averaged GPU allocation (RAN plus LLM) over the 10-year…

**Figure 6.** Weekly LLM gross revenue over the deployment lifetime under…

**Figure 7.** CapEx for the Milan network for Aerial and FlexRAN under Scenario 1…

**Figure 8.** Cumulative LLM revenue R (Eq. (19)) over the deployment horizon under different values of k = ρ_tok/ρ_dens. Top: Scenario 1. Bottom: Scenario 2. The horizontal dashed line indicates the marginal investment I (Eq. (18)).
Adjacent body text captured with this figure (truncated): … during low-traffic periods (e.g., overnight), so the incremental energy expenditure is directly tied to the LLM workload and is therefore accounted for in the net return R (Eq. (19)). In Sce…

**Figure 9.** Return on investment (R/I) of AI-RAN by scenario and depreciation ratio k = ρ_tok/ρ_dens. Investment I and return R defined in Eqs. (18) and (19).
Adjacent body text captured with this figure (truncated): … around week 235. The k=2 curve never reaches break-even: the cumulative revenue at week 520 ($0.35M) falls far short of the $0.62M threshold, confirming that token deflation at twice the rate of efficiency improvement makes the investment irrecoverable on this tim…
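
The break-even numbers quoted in the text captured with Figure 9 can be read as a ratio. The worked step below assumes that the $0.62M threshold is the marginal investment I (the dashed line of Figure 8) and that the quoted cumulative revenue at week 520 stands in for R; neither identification is confirmed on this page.

```latex
\[
\left.\frac{R}{I}\right|_{k=2,\; w=520}
\approx \frac{0.35\ \text{M\$}}{0.62\ \text{M\$}}
\approx 0.56 < 1
\]
```

Under those assumptions, the k = 2 case recovers a little over half of the marginal investment by the end of the 10-year horizon, which is consistent with the statement that the curve never reaches break-even.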

Original abstract

The large-scale deployment of 5G networks has not delivered the expected return on investment for mobile network operators, raising concerns about the economic viability of future 6G rollouts. At the same time, surging demand for Artificial Intelligence (AI) inference and training workloads is straining global compute capacity. AI-RAN architectures, in which Radio Access Network (RAN) platforms accelerated on Graphics Processing Units (GPUs) share idle capacity with AI workloads during off-peak periods, offer a potential path to improved capital efficiency. However, the economic case for such systems remains unsubstantiated. In this paper, we present a techno-economic analysis of AI-RAN deployments by combining publicly available benchmarks of 5G Layer-1 processing on heterogeneous platforms -- from x86 servers with accelerators for channel coding to modern GPUs -- with realistic traffic models and AI service demand profiles for Large Language Model (LLM) inference. We construct a joint cost and revenue model that quantifies the surplus compute capacity available in GPU-based RAN deployments and evaluates the returns from leasing it to AI tenants. Our results show that, across a range of scenarios encompassing token depreciation, varying demand dynamics, and diverse GPU serving densities, the additional capital and operational expenditures of GPU-heavy deployments are offset by AI-on-RAN revenue, yielding a return on investment of up to 8x. These findings strengthen the long-term economic case for accelerator-based RAN architectures and future 6G deployments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a joint cost-revenue model for GPU-based AI-RAN that claims up to 8x ROI from leasing surplus capacity, but the supporting calculations and real-time feasibility checks are not shown clearly enough to evaluate.

The paper's main takeaway is a techno-economic model for open AI-RAN that suggests GPU deployments can achieve up to 8x ROI by leasing surplus capacity to AI workloads during off-peak times. It builds this by combining public 5G Layer-1 processing benchmarks on various hardware with traffic models and LLM inference demand profiles. The joint cost and revenue framework then calculates available surplus and potential leasing income.

This is a useful step because it moves beyond general discussions of programmable RAN to specific economic projections across scenarios involving token depreciation and different GPU serving densities. The approach earns credit for using existing data sources to create a practical analysis tool. It addresses the economic viability concerns for 6G by showing how AI can help offset costs.

However, the presentation leaves some gaps. The abstract reports the ROI results but provides no equations, tables, or step-by-step validation, which makes it hard to assess the soundness of the calculations. The stress-test point about real-time RAN constraints is important to verify; if the model does not account for sub-millisecond deadlines and possible overlaps with AI demands, the estimated surplus capacity could be too high. Since everything rests on composed benchmarks rather than new measurements, that adds a layer of uncertainty.

This paper would interest readers focused on network economics, 6G planning, and AI-RAN architectures. It offers a starting point for discussions on infrastructure sharing and capital efficiency. It deserves a serious referee because the question is relevant and the framework is a concrete contribution, though it will benefit from added transparency on the model and assumptions. I recommend sending it for peer review with feedback on detailing the derivations and checking schedulability.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a techno-economic framework for AI-RAN that integrates publicly available 5G Layer-1 processing benchmarks on heterogeneous platforms (x86 servers with accelerators through modern GPUs) with realistic traffic models and LLM inference demand profiles. It constructs a joint cost-revenue model to quantify surplus GPU capacity available for leasing to AI tenants during off-peak periods and evaluates the resulting returns, concluding that additional GPU capex/opex is offset by AI revenue to yield up to 8x ROI across scenarios that vary token depreciation, demand dynamics, and GPU serving densities.

Significance. If the projections are robust, the work supplies a quantitative basis for the economic viability of GPU-accelerated RAN platforms, directly addressing operator concerns about 5G/6G ROI by demonstrating capital efficiency gains through compute sharing with AI workloads. The reliance on public benchmarks and composable traffic/AI demand profiles is a practical strength that could be extended by operators for network planning.

major comments (2)

§4 (Surplus Capacity Model): The estimation of shareable idle GPU capacity is built from public 5G L1 benchmarks and average traffic profiles without explicit enforcement of sub-millisecond deterministic scheduling constraints or isolation margins for worst-case overlap between RAN peaks and variable-latency LLM inference. Because this surplus directly determines the revenue term in the ROI calculation, the 8x figure rests on an assumption whose validity is not demonstrated by end-to-end measurement or formal schedulability analysis.
Results (ROI sensitivity): The reported returns are shown across token depreciation, demand dynamics, and GPU density variations, yet no corresponding sensitivity is provided for tighter real-time isolation requirements. Adding such a stress test would be necessary to confirm that the central claim survives realistic RAN constraints.
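
The stress test requested in these comments can be pictured with a small Python sketch. It is a hypothetical illustration, not the paper's Eqs. (18) and (19): the isolation margin is modeled here as a fixed fraction of surplus GPU-hours withheld from leasing, energy cost is charged per leased GPU-hour, and all names and numbers are assumptions.

```python
def roi_with_margin(surplus_gpu_hours, margin, revenue_per_gpu_hour,
                    energy_cost_per_gpu_hour, investment):
    """Return R/I when a fraction `margin` of the surplus is kept in reserve.

    Assumption (hypothetical): the reserved share earns nothing and the
    remaining share earns revenue minus energy cost per GPU-hour.
    """
    if not 0.0 <= margin <= 1.0:
        raise ValueError("margin must lie in [0, 1]")
    usable = surplus_gpu_hours * (1.0 - margin)
    net_return = usable * (revenue_per_gpu_hour - energy_cost_per_gpu_hour)
    return net_return / investment

# Illustrative sweep only; none of these values come from the paper.
margins = [0.0, 0.25, 0.5]
sweep = [roi_with_margin(1000, m, 3.0, 1.0, 500) for m in margins]
print(dict(zip(margins, sweep)))
```

Under this simple linear assumption the ratio falls in direct proportion to the reserved share, so a sensitivity plot would mainly show where R/I drops below 1 for each scenario.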

minor comments (2)

Abstract: The claim of 'up to 8x ROI' is stated without reference to the underlying equations or data sources; a single sentence pointing to the key modeling assumptions would improve clarity.
Notation: Ensure that 'surplus compute capacity' is defined consistently when moving between the cost model and the revenue model to prevent reader confusion.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address each of the major comments below and indicate the revisions made to the manuscript.

Referee: §4 (Surplus Capacity Model): The estimation of shareable idle GPU capacity is built from public 5G L1 benchmarks and average traffic profiles without explicit enforcement of sub-millisecond deterministic scheduling constraints or isolation margins for worst-case overlap between RAN peaks and variable-latency LLM inference. Because this surplus directly determines the revenue term in the ROI calculation, the 8x figure rests on an assumption whose validity is not demonstrated by end-to-end measurement or formal schedulability analysis.

Authors: Our techno-economic framework intentionally employs publicly available benchmarks and average traffic profiles to model surplus capacity, as these provide a practical basis for the analysis without requiring proprietary data. We recognize that this approach does not incorporate explicit sub-millisecond scheduling constraints or worst-case isolation margins. To address this, we have added a discussion in the revised Section 4 on the potential impact of such constraints and how operators might adjust the model for their specific schedulers. We maintain that the 8x ROI represents the modeled scenario and serves as a benchmark for further refinement.

Revision: partial

Referee: Results (ROI sensitivity): The reported returns are shown across token depreciation, demand dynamics, and GPU density variations, yet no corresponding sensitivity is provided for tighter real-time isolation requirements. Adding such a stress test would be necessary to confirm that the central claim survives realistic RAN constraints.

Authors: We agree with the need for additional sensitivity analysis regarding real-time isolation. In the revised manuscript, we have extended the Results section to include sensitivity tests that vary the assumed isolation margins and scheduling tightness. These new results demonstrate that while tighter constraints reduce the available surplus and thus the ROI, the economic benefits remain significant, supporting the viability of AI-RAN even under more conservative assumptions.

Revision: yes

Standing simulated objections (not resolved)

The rebuttal does not add new end-to-end measurements or a formal schedulability analysis, since the study relies on existing public benchmarks and modeling rather than original experimental deployments.

Circularity Check

0 steps flagged

No significant circularity; model uses external benchmarks and profiles as independent inputs

The paper's derivation chain starts from publicly available 5G L1 benchmarks on heterogeneous platforms, combined with stated traffic models and AI demand profiles for LLM inference. It then constructs a joint cost/revenue model to quantify surplus GPU capacity and compute ROI (up to 8x). No equations or steps reduce the final ROI or surplus estimates to fitted parameters or self-citations by construction; the outputs are forward projections from the external data sources. The central claim remains independent of the target results, with no self-definitional loops, renamed known results, or load-bearing self-citations identified in the provided derivation description.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

Ledger populated from abstract only; specific numerical values and derivations unavailable. Free parameters reflect the varying scenario elements named in the abstract. Axioms capture the core modeling assumptions stated.

free parameters (3)

token depreciation: Varying rates considered across scenarios to test sensitivity of ROI
demand dynamics: Varying AI service demand profiles for LLM inference
GPU serving density: Diverse densities evaluated in the model
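
How the first and third parameters interact through the ratio k = ρ_tok/ρ_dens (named in the captions of Figures 8 and 9) can be sketched in Python. The functional form is an assumption made for illustration: token price is taken to decay exponentially at an annual rate `rho_tok` while tokens served per GPU grow exponentially at `rho_dens`. The page does not state the paper's actual equations, and the numeric rates below are hypothetical.

```python
import math

def revenue_factor(week, rho_dens, k):
    """Revenue per GPU at `week`, relative to week 0.

    Assumption (hypothetical): price ~ exp(-rho_tok * t) and serving
    density ~ exp(+rho_dens * t), with t in years and rho_tok = k * rho_dens.
    """
    rho_tok = k * rho_dens
    years = week / 52.0
    return math.exp((rho_dens - rho_tok) * years)

def cumulative_factor(weeks, rho_dens, k):
    """Sum of weekly relative revenue over the deployment horizon."""
    return sum(revenue_factor(w, rho_dens, k) for w in range(weeks))

# Illustrative 10-year horizon (520 weeks) with a made-up density rate.
flat = cumulative_factor(520, rho_dens=0.5, k=1.0)
falling = cumulative_factor(520, rho_dens=0.5, k=2.0)
rising = cumulative_factor(520, rho_dens=0.5, k=0.5)
print(flat, falling, rising)
```

In this sketch k = 1 keeps revenue per GPU flat, k > 1 makes it shrink over time, and k < 1 makes it grow, which matches the qualitative reading that faster token deflation than efficiency improvement erodes the return.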

axioms (2)

Domain assumption: Publicly available benchmarks accurately represent 5G Layer-1 processing performance on x86 servers with accelerators and modern GPUs. Used as input to combine with traffic models.
Domain assumption: Traffic models and AI service demand profiles are realistic representations of real-world conditions. Basis for projecting surplus capacity and revenue.

[1] [1]

Global mobile trends 2023,

GSMA Intelligence, “Global mobile trends 2023,” 2023, accessed: 2025-06. [Online]. Available: https://www.gsma.com/

work page 2023

[2] [2]

The economic potential of generative AI: The next productivity frontier,

McKinsey Global Institute, “The economic potential of generative AI: The next productivity frontier,” McKinsey & Company, Tech. Rep., Jun. 2023, accessed: 2025-06. [Online]. Available: https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/ the-economic-potential-of-generative-ai-the-next-productivity-frontier

work page 2023

[3] [3]

BurstGPT: A real-world workload dataset to optimize LLM serving systems,

Y . Wang, Y . Chenet al., “BurstGPT: A real-world workload dataset to optimize LLM serving systems,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’25). Toronto, ON, Canada: ACM, 2025. [Online]. Available: https://doi.org/10.1145/3711896.3737413

work page doi:10.1145/3711896.3737413 2025

[4] [4]

Beyond connectivity: An open architecture for AI-RAN convergence in 6G,

M. Polese, N. Mohamadiet al., “Beyond connectivity: An open architecture for AI-RAN convergence in 6G,”IEEE Communications Magazine (to appear), 2025. [Online]. Available: https://arxiv.org/abs/ 2507.06911

work page arXiv 2025

[5] [5]

Industry leaders form AI-RAN alliance,

AI-RAN Alliance, “Industry leaders form AI-RAN alliance,” 2024, mWC Barcelona. [Online]. Available: https://ai-ran.org/news/industry- leaders-in-ai-and-wireless-form-ai-ran-alliance/

work page 2024

[6] [6]

AI-RAN: Transforming RAN with AI-driven computing infrastructure,

L. Kundu, X. Linet al., “AI-RAN: Transforming RAN with AI-driven computing infrastructure,”arXiv preprint arXiv:2501.09007, 2025. [Online]. Available: https://arxiv.org/abs/2501.09007

work page arXiv 2025

[7] [7]

The interplay of AI-and- RAN: Dynamic resource allocation for converged 6G platform,

S. D. A. Shah, Z. Nezamiet al., “The interplay of AI-and- RAN: Dynamic resource allocation for converged 6G platform,” arXiv preprint arXiv:2503.07420, 2025. [Online]. Available: https: //arxiv.org/abs/2503.07420

work page arXiv 2025

[8] [9]

[Online]. Available: https://arxiv.org/abs/2507.09124

[9] [10]

Analysys Mason, “Open RAN TCO analysis,” Tech. Rep., 2022, commissioned by Wind River. [Online]. Available: https://www.analysysmason.com/contentassets/b3260036a0d449718117eeaf5ac83472/analysys_mason_open_ran_tco_feb2022_rma16_rma18.pdf

[10] [11]

Analysys Mason, “Open RAN progress drives confidence,” Tech. Rep., 2024, commissioned by Wind River. [Online]. Available: https://www.analysysmason.com/contentassets/a99b7d01b9e64a2cafc375459c99de99/analysys_mason_open_ran_confidence_apr2024_rma18.pdf

[11] [12]

S. Pongratz, “Open RAN and vRAN revenue trends,” 2024, Dell’Oro Group. [Online]. Available: https://www.fierce-network.com/modernization/open-ran-and-vran-tanks-2023

[12] [13]

E. Erdil, “Inference economics of language models,” arXiv preprint arXiv:2506.04645, 2025. [Online]. Available: https://arxiv.org/abs/2506.04645

[13] [14]

H. Gundlach, J. Lynch et al., “The price of progress: Algorithmic efficiency and the falling cost of AI inference,” arXiv preprint arXiv:2511.23455, 2025. [Online]. Available: https://arxiv.org/abs/2511.23455

[14] [15]

M. Demirer, A. Fradkin et al., “The emerging market for intelligence: Pricing, supply, and demand for LLMs,” 2025, working paper. [Online]. Available: https://andreyfradkin.com/assets/LLM_Demand_12_12_2025.pdf

[15] [16]

C. Xiao, J. Cai et al., “Densing law of LLMs,” Nature Machine Intelligence, vol. 7, pp. 1823–1833, 2025

[16] [18]

G. Cai, R. Tian et al., “Efficient inference for edge large language models: A survey,” Tsinghua Science and Technology, vol. 31, no. 3, pp. 1365–1380, 2026

[17] [19]

D. Ding, A. Mallick et al., “Hybrid LLM: Cost-efficient and quality-aware query routing,” in Proc. ICLR, 2024. [Online]. Available: https://openreview.net/forum?id=02f3mUtqnM

[18] [20]

A. Kelkar and C. Dick, “NVIDIA Aerial GPU hosted AI-on-5G,” in Proc. IEEE 4th 5G World Forum (5GWF), 2021, pp. 64–69

[19] [21]

A. Garcia-Saavedra and X. Costa-Perez, “O-RAN: Disrupting the virtualized RAN ecosystem,” IEEE Communications Standards Magazine, vol. 5, no. 4, pp. 96–103, 2021

[20] [22]

NVIDIA, “Enabling the world’s first GPU-accelerated 5G Open RAN for NTT DOCOMO with NVIDIA Aerial,” 2023, accessed: 2025-12. [Online]. Available: https://developer.nvidia.com/blog/enabling-the-worlds-first-gpu-accelerated-5g-open-ran-for-ntt-docomo-with-nvidia-aerial/

[21] [23]

NVIDIA, “Aerial CUDA-accelerated RAN release 25-2,” 2025, accessed: 2025-12. [Online]. Available: https://docs.nvidia.com/aerial/cuda-accelerated-ran/25-2/aerial-cuda-accelerated-ran.pdf

[22] [24]

Quanta Cloud Technology, “QCT VRB100 vRAN server specifications,” 2024, vendor specifications; accessed: 2025-12. [Online]. Available: https://www.qct.io/

[23] [25]

Intel and Hewlett Packard Enterprise, “Verified reference configuration for virtualized RAN on the HPE ProLiant DL110,” 2022, accessed: 2025-

[24] [26]

[Online]. Available: https://builders.intel.com/docs/networkbuilders/intel-hpe-verified-reference-configuration-for-virtualized-radio-access-networks-on-the-hpe-proliant-dl110-1653673153.pdf

[25] [27]

A. Kelkar and C. Dick, “NVIDIA Aerial GPU hosted AI-on-5G,” in 2021 IEEE 4th 5G World Forum (5GWF). IEEE, 2021, pp. 64–69

[26] [28]

C. Wang, H. Nie et al., “Understanding 5G Performance on Heterogeneous Computing Architectures,” IEEE Communications Magazine, vol. 63, no. 3, pp. 107–113, March 2025

[27] [29]

Intel Corporation, “Intel accelerates 5G leadership with new products,” 2023, accessed: 2025-12. [Online]. Available: https://www.intc.com/news-events/press-releases/detail/1606/intel-accelerates-5g-leadership-with-new-products

[28] [30]

ITU-R, “Report ITU-R M.2412-0: Guidelines for evaluation of radio interface technologies for IMT-2020,” International Telecommunication Union, Tech. Rep., Oct. 2017, Dense Urban-eMBB; accessed: 2025-06. [Online]. Available: https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-M.2412-2017-PDF-E.pdf

[29] [31]

ITU-R, “Recommendation ITU-R M.2160-0: Framework and overall objectives of the future development of IMT for 2030 and beyond,” International Telecommunication Union, Tech. Rep., Nov. 2023, accessed: 2025-06. [Online]. Available: https://www.itu.int/dms_pubrec/itu-r/rec/m/R-REC-M.2160-0-202311-I!!PDF-E.pdf

[30] [32]

ITU-R, “Report ITU-R M.2410-0: Minimum requirements related to technical performance for IMT-2020 radio interface(s),” International Telecommunication Union, Tech. Rep., Nov. 2017, accessed: 2025-06. [Online]. Available: https://www.itu.int/pub/R-REP-M.2410

[31] [33]

G. Barlacchi, M. De Nadai et al., “A multi-source dataset of urban life in the city of Milan and the Province of Trentino,” Scientific Data, vol. 2, p. 150055, 2015. [Online]. Available: https://doi.org/10.1038/sdata.2015.55

[32] [34]

Y. Wang, Y. Chen et al., “BurstGPT: A real-world workload dataset to optimize LLM serving systems,” arXiv preprint arXiv:2401.17644, 2024. [Online]. Available: https://arxiv.org/abs/2401.17644

[33] [35]

R. A. Lee, “Claude vs. ChatGPT statistics 2026: Head-to-head numbers behind the AI battle,” 2025, 2.5–3B prompts/day, 190.6M DAU; accessed: 2025-10. [Online]. Available: https://sqmagazine.co.uk/claude-vs-chatgpt-statistics/

[34] [36]

ISTAT, “Resident population and population density – municipality of Milan,” 2024, population density ≈ 7,500/km²; accessed: 2025-06. [Online]. Available: https://www.istat.it/en/

[35] [37]

GSMA Intelligence, “The state of mobile internet connectivity 2024,” 2024, accessed: 2025-06. [Online]. Available: https://data.gsmaintelligence.com/research/research/research-2024/the-state-of-mobile-internet-connectivity-2024

[36] [38]

J. Lorincz and Z. Klarin, “A comprehensive analysis of the impact of an increase in user devices on the long-term energy efficiency of 5G networks,” Smart Cities, vol. 7, no. 6, pp. 3616–3657, 2024. [Online]. Available: https://www.mdpi.com/2624-6511/7/6/140

[37] [39]

ITU-R WP 5D, “Preliminary draft new report ITU-R M.[IMT-2030.TECH PERF REQ]: Minimum requirements related to technical performance for IMT-2030 radio interface(s),” International Telecommunication Union, Tech. Rep., 2024, working document; accessed: 2025-06

[38] [40]

3GPP, “TS 38.214: NR; physical layer procedures for data (release 18),” 3rd Generation Partnership Project, Tech. Rep., 2024, accessed: 2025-06. [Online]. Available: https://www.3gpp.org/dynareport/38214.htm

[39] [41]

Uptime Institute, “2024 global data center survey: Report,” 2024, accessed: 2025-06. [Online]. Available: https://datacenter.uptimeinstitute.com/rs/711-RIA-145/images/2024.GlobalDataCenterSurvey.Report.pdf

[40] [42]

Eurostat, “Electricity price statistics,” 2024, household and industrial prices; 0.3291 EUR/kWh; accessed: 2025-06. [Online]. Available: https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Electricity_price_statistics

[41] [43]

Ericsson, “Ericsson mobility report,” Tech. Rep., November 2025, accessed: February 23, 2026. [Online]. Available: https://www.ericsson.com/4aca6f/assets/local/reports-papers/mobility-report/documents/2025/ericsson-mobility-report-november-2025.pdf

[42] [44]

Artificial Analysis, “LLM performance leaderboard: Model and API provider benchmarks,” 2024, accessed: 2025-06. [Online]. Available: https://artificialanalysis.ai/leaderboards/models

[43] [45]

J. Pichlmeier, P. Ross, and A. Luckow, “Performance characterization of expert router for scalable LLM inference,” arXiv preprint arXiv:2404.15153, 2024. [Online]. Available: https://arxiv.org/abs/2404.15153

[44] [46]

Together AI, “Together AI – inference pricing,” 2024, Llama 3.3 70B: $0.88/Mtok; accessed: 2025-06. [Online]. Available: https://www.together.ai/pricing

[45] [47]

B. Savoldi, G. Attanasio et al., “Generative AI practices, literacy, and divides: An empirical analysis in the Italian context,” 2025. [Online]. Available: https://arxiv.org/abs/2512.03671