archive

Every paper Pith has read.

1164 papers in cs.DC · page 18

cs.DC 2025-12-29 reviewed

Bitcoin subnets stake in L1 BTC to cut tx cost by 23x
Bitcoin-IPC Whitepaper: Scaling Bitcoin with a Network of Proof-of-Stake Subnets

Marko Vukolić +5
cs.DC 2025-12-26 reviewed

Graph-guided LLM fixes cloud incidents with 6x accuracy
PRAXIS: Integrating Program Analysis with Observability for Root-Cause Analysis

Shengkun Cui +3
cs.LO 2025-12-24 reviewed

Modal logic axioms capture distributed protocols
Declarative distributed algorithms as axiomatic theories in three-valued modal logic over semitopologies

Murdoch J. Gabbay
cs.DC 2025-12-24 reviewed

GPU data structure speeds hypergraph triad counting up to 473x
ESCHER: Efficient and Scalable Hypergraph Evolution Representation with Application to Triad Counting

S. M. Shovan +3
cs.DC 2025-12-23 reviewed

SHIRO delivers 221x SpMM speedup on 128 GPUs via sparsity-aware transfers
SHIRO: Near-Optimal Communication Strategies for Distributed Sparse Matrix Multiplication

Chen Zhuang +7
cs.RO 2025-12-22 reviewed

ROS 2 real-time support detailed in survey of analyses and enhancements
A Survey of Real-Time Support, Analysis, and Advancements in ROS 2

Daniel Casini +4
cs.DC 2025-12-22 reviewed

Length groups cut LLM latency by up to 67%
CascadeInfer: Length-Aware Scheduling of LLM Serving with Low Latency and Load Balancing

Yitao Yuan +7
cs.DC 2025-12-22 reviewed

Uncertainty scores select compatible peers in decentralized learning
Evidential Trust-Aware Model Personalization in Decentralized Federated Learning for Wearable IoT

Murtaza Rangwala +2
cs.DC 2025-12-18 reviewed

Tool spots bit-flip faults in LLMs for fast fixes
BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs

Muhammad Zeeshan Karamat +2
cs.DC 2025-12-18 reviewed

Federated platform runs full AI lifecycle in open science cloud
AI4EOSC: a Federated Cloud Platform for Artificial Intelligence in Scientific Research

Ignacio Heredia +30
cs.DC 2025-12-18 reviewed

Multipath routing lifts host-GPU bandwidth 4.6x
MultiPath Memory Access: Breaking Host-GPU Bandwidth Bottlenecks in LLM Services

Lingfeng Tang +8
cs.DC 2025-12-17 reviewed

TileLoom matches vendor libraries on spatial accelerators
TileLoom: Automatic Dataflow Planning for Tile-Based Languages on Spatial Dataflow Accelerators

Wei Li +8
cs.DC 2025-12-17 reviewed

Data movement bottlenecks sit outside the network core
Reexamining Paradigms of End-to-End Data Movement

Chin Fang +3
cs.LG 2025-12-16 reviewed

Automated planner boosts any-to-any model goodput up to 6x
Cornfigurator: Automated Planning for Any-to-Any Multimodal Model Serving

Jeff J. Ma +7
cs.DC 2025-12-15 reviewed

Framework links SKA imaging quality to energy and cost metrics
astroCAMP: A Community Benchmark and Co-Design Framework for Sustainable SKA-Scale Radio Imaging

Denisa-Andreea Constantinescu +9
cs.DC 2025-12-15 reviewed

Disaggregating attention and experts yields 4.7x MoE inference speedup
Janus: Disaggregating Attention and Experts for Scalable MoE Inference

Zhexiang Zhang +12
cs.DC 2025-12-13 reviewed

HetRL raises LLM RL throughput up to 9x on mixed GPUs
HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments

Yongjun He +7
cs.DC 2025-12-13 reviewed

Edge devices train large models at cloud speeds via GEMM asymmetry
On Harnessing Idle Compute at the Edge for Foundation Model Training

Leyang Xue +5
cs.SE 2025-12-13 reviewed

Async Kafka rules shift availability forecasts by 0.001 points or less
Evaluating Asynchronous Semantics in Trace-Discovered Resilience Models: A Case Study on the OpenTelemetry Demo

Anatoly A. Krasnovsky
cs.DC 2025-12-13 reviewed

DRL model resolves conflicts in computing continuum resources
A Conflict-Aware Resource Management Framework for the Computing Continuum

Vlad Popescu-Vifor +3
cs.LG 2025-12-13 reviewed

Low-rank LLMs train up to 2.27x faster with new parallelism
BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models

Zhengyang Wang +7
cs.CR 2025-12-11 reviewed

Hybrid noise keeps ML models at 80 percent accuracy on private health data
Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems

N Mangala +7
cs.DC 2025-12-10 reviewed

SynthPix streams synthetic PIV images on demand at accelerator speed
SynthPix: A lightspeed PIV image generator

Antonio Terpin +3
cs.DC 2025-12-10 reviewed

Local build lets spiking networks scale to thousands of GPUs
Scalable Construction of Spiking Neural Networks using up to thousands of GPUs

Bruno Golosio +12
cs.DC 2025-12-10 reviewed

Prewarming multiple LLMs cuts tail TTFT by 50x
WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving

Chiheng Lou +7
cs.DC 2025-12-10 reviewed

BatANN passes full query state to scale vector search
Passing the Baton: High Throughput Distributed Disk-Based Vector Search with BatANN

Nam Anh Dang +2
cs.LG 2025-12-10 reviewed

SHARe-KAN cuts KAN head storage 9.3x at 2-point accuracy cost
SHARe-KAN: Post-Training Vector Quantization for Cache-Resident KAN Inference

Jeff Smith
cs.DC 2025-12-08 reviewed

Bi-level search finds ML shifts that quadruple VM allocation loss
A Performance Analyzer for a Public Cloud's ML-Augmented VM Allocator

Roozbeh Bostandoost +10
cs.DC 2025-12-06 reviewed

Vector LUT speeds parallel ultra-low-bit LLM inference up to 4.2×
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Xiangyu Li +5
cs.DC 2025-12-05 reviewed

Gradual model growth lets limited clients contribute in federated learning
Breaking the Capacity Bottleneck in Model-Heterogeneous Federated Learning via Gradual Model Restoration

Chengjie Ma +3
cs.DC 2025-11-27 reviewed

Tokenized context speeds edge LLM responses by up to 14%
DisCEdge: Distributed Context Management for Large Language Models at the Edge

Mohammadreza Malekabbasi +2
cs.DC 2025-11-26 reviewed

Diagonal scaling cuts database p95 latency by up to 40%
Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases

Shahir Abdullah +1
cs.DC 2025-11-25 reviewed

Voxel traits let Spira skip kernel-map overhead for 3x faster point-cloud convolution
Spira: Exploiting Voxel Data Structural Properties for Efficient Sparse Convolution in Point Cloud Networks

Dionysios Adamopoulos +3
cs.MA 2025-11-21 reviewed

LLM agents give PyTorch 2.88x speedup on H100 GPUs
Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems

Kirill Nagaitsev +3
cs.AR 2025-11-19 reviewed

Joint data-compute tuning speeds ML kernels on PIM up to 13x
DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures

Peiming Yang +6
cs.LG 2025-11-18 reviewed

Adaptive reputation defends federated learning from malicious clients
FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning

Abolfazl Younesi +4
cs.DC 2025-11-18 reviewed

Seer speeds LLM RL rollouts up to 2x by learning prompt output patterns
Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

Ruoyu Qin +9
cs.DC 2025-11-17 reviewed

Regression attributes node energy to individual processes
Learning Process Energy Profiles from Node-Level Power Data

Jonathan Bader +6
cs.LG 2025-11-13 reviewed

Satellite system reduces imagery latency from 51 to 21 minutes
EarthSight: A Distributed Framework for Low-Latency Satellite Intelligence

Ansel Kaplan Erol +2
cs.DC 2025-11-13 reviewed

Thermal imbalance creates stragglers that slow multi-GPU nodes
Lit Silicon: A Case Where Thermal Imbalance Couples Concurrent Execution in Multiple GPUs

Marco Kurzynski +2
cs.DC 2025-11-12 reviewed

NVRAR cuts multi-node LLM latency up to 3.6x
Understanding and Improving Communication Performance in Multi-node LLM Inference

Prajwal Singhania +6
cs.DC 2025-11-12 reviewed

SpaDA expresses parallel patterns in 14x fewer lines
SpaDA: A Spatial Dataflow Architecture Programming Language

Lukas Gianinazzi +2
cs.DC 2025-11-11 reviewed

Local models handle 88.7% of queries at higher intelligence per watt
Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

Jon Saad-Falcon +14
cs.DC 2025-11-10 reviewed

DMA offloads close 4.5x gap for latency-bound ML collectives
DMA-Latte: Expanding the Reach of DMA Offloads to Latency-bound ML Communication

Suchita Pati +5
physics.comp-ph 2025-11-06 reviewed

Domain decomposition scales Monte Carlo to 16384 cores
Scalable Domain-decomposed Monte Carlo Neutral Transport for Nuclear Fusion

Oskar Lappi +5
cs.DC 2025-11-05 reviewed

Spectral map decides solvability of colorless tasks
Stone Duality Proofs for Colorless Distributed Computability Theorems

Cameron Calk +1
quant-ph 2025-11-05 reviewed

Simulator reaches 50-qubit universal quantum runs on exascale machine
Universal Quantum Computer Simulation of 50 Qubits on Europe's First Exascale Supercomputer Harnessing Its Heterogeneous CPU-GPU Architecture

Hans De Raedt +7
cs.DC 2025-11-05 reviewed

Unified layout cuts LLM decode time on edge NPUs by up to 3x
UMDAM: A Unified Data Layout and DRAM Address Mapping for Heterogenous NPU-PIM

Hai Huang
cs.DC 2025-11-05 reviewed

Essential agents split global platforms into four classes
Characterising Global Platforms: Centralised, Decentralised, Federated, and Grassroots

Ehud Shapiro
cs.AI 2025-11-05 reviewed

SnapStream cuts KV cache memory by 4x for 128k LLM inference
SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators

Jonathan Li +21