The rising costs of training frontier AI models

2024 · arXiv 2405.21015

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 1 · unclear 1

representative citing papers

AI Sovereignty: A Qualitative Model of Strategic Competition as AI Becomes an Instrument of National Power

cs.CY · 2026-06-05 · unverdicted · novelty 6.0

The authors introduce definitions and a qualitative model of AI sovereignty that identifies multi-scale contributors and leverage points nations can target through kinetic and non-kinetic actions to influence AI-driven national power.

Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

Rescaled ASGD recovers convergence to the true global objective by rescaling worker stepsizes proportional to computation times, matching the known time lower bound in the leading term under non-convex smoothness and bounded heterogeneity.

Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification

cs.AI · 2026-04-18 · unverdicted · novelty 6.0

Cross-model semantic disagreement adds an epistemic uncertainty term that improves total uncertainty estimation over self-consistency alone, helping flag confident errors in LLMs.

Switching Efficiency: A Novel Framework for Dissecting AI Data Center Network Efficiency

cs.NI · 2026-04-16 · unverdicted · novelty 6.0

Introduces Switching Efficiency (η) decomposed into data, routing efficiency, and port utilization factors to analyze and improve communication bottlenecks in AI data center networks for LLM training.

When Do We Need LLMs? A Diagnostic for Language-Driven Bandits

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.

FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation

cs.CR · 2026-03-10 · unverdicted · novelty 6.0

FlexServe uses flexible resource isolation in TrustZone to achieve up to 10x faster time-to-first-token than standard approaches for secure LLM inference on mobile devices.

Spectral structural distortion reveals redundant neurons in neural networks

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

A graph-spectral importance score based on layer-wise structural distortion between pre- and post-activation neuron graphs identifies removable neurons for iterative pruning without intermediate updates, followed by recovery fine-tuning.

Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP

cs.DC · 2026-05-08 · unverdicted · novelty 5.0

FCP shards sequences at block level with flexible P2P communication and bin-packing to achieve near-linear scaling up to 256 GPUs and 1.13x-2.21x higher attention MFU in foundation model pre-training.

Who Prices Cognitive Labor in the Age of Agents? Compute-Anchored Wages

cs.AI · 2026-05-07 · unverdicted · novelty 5.0 · 2 refs

AI agents convert compute capital into cognitive labor units, so on substitutable tasks the competitive human wage is bounded above by relative productivity times compute intensity times the rental rate of compute.

Physics Priors Offer Useful Accuracy-Carbon Trade-Offs in Spatio-Temporal Forecasting

cs.LG · 2025-09-29 · unverdicted · novelty 5.0

Stronger physics priors in neural networks for spatio-temporal shear flow forecasting yield substantially lower training carbon footprints than weak or no priors, though inference savings are less consistent.

From Cradle to Cloud: A Life Cycle Review of AI's Environmental Footprint

cs.CY · 2026-05-06 · unverdicted · novelty 4.0

A review of AI sustainability studies finds inconsistent life cycle definitions and predominant reliance on coarse CO2e proxies, with limited coverage of water, materials, and multi-impact assessments.

The End of the Foundation Model Era: Open-Weight Models, Sovereign AI, and Inference as Infrastructure

cs.CY · 2026-03-18 · unverdicted · novelty 3.0

Open-weight models have ended the foundation model era by eliminating pre-training as a durable moat and enabling sovereign AI control through direct access to model weights.

LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems

cs.LG · 2026-01-20 · unverdicted · novelty 3.0

A survey taxonomy of LLMs identifies three scaling crises and six efficiency paradigms while tracing the shift from generation to tool-using agents.

Towards EnergyGPT: A Large Language Model Specialized for the Energy Sector

cs.CL · 2025-09-08 · unverdicted · novelty 3.0

Fine-tuned LLaMA 3.1-8B variants for the energy sector outperform the base model on domain QA benchmarks, with LoRA delivering similar gains at lower training cost.

citing papers explorer

FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation

cs.CR · 2026-03-10 · unverdicted · none · ref 18

FlexServe uses flexible resource isolation in TrustZone to achieve up to 10x faster time-to-first-token than standard approaches for secure LLM inference on mobile devices.
