archive

Every paper Pith has read.

1164 papers in cs.DC · page 19

cs.DC 2025-11-04 reviewed

Simpler recovery fixes leaderless Paxos and generalizes it optimally
Making Democracy Work: Fixing and Simplifying Egalitarian Paxos (Extended Version)

Fedor Ryabinin +2
cs.LG 2025-11-03 reviewed

Neuromorphic chip learns new classes with 113x lower latency
Online Continual Learning on Intel Loihi 2 via a Co-designed Spiking Neural Network

Elvin Hajizada +7
cs.DC 2025-11-03 reviewed

Stable errors bound local clock skew by O(Δ + δ log D)
Gradient Clock Synchronization with Practically Constant Local Skew

Christoph Lenzen
cs.DC 2025-10-31 reviewed

Uniform RDMA WriteImm interface reaches 400 Gbps on NVIDIA and AWS NICs
fabric-lib: RDMA Point-to-Point Communication for LLM Systems

Nandor Licker +3
cs.DC 2025-10-31 reviewed

kNN predicts good sub-system sizes for GPU tridiagonal partition
ML-Based Optimum Sub-system Size Heuristic for the GPU Implementation of the Tridiagonal Partition Method

Milena Veneva
cs.AI 2025-10-31 reviewed

AI matches human experts in designing LLM cluster algorithms
Glia: A Human-Inspired AI for Automated Systems Design and Optimization

Pouya Hamadanian +7
cs.LG 2025-10-25 reviewed

Dictator clients erase honest contributions in federated learning
Power to the Clients: Federated Learning in a Dictatorship Setting

Mohammadsajad Alipour +1
cs.DC 2025-10-24 reviewed

Quasipolylog rounds for (Δ+1)-coloring when neighborhood independence is bounded
Distributed $(\Delta+1)$-Coloring in Graphs of Bounded Neighborhood Independence

Marc Fuchs +1
cs.NI 2025-10-22 reviewed

SWOT cuts collective communication time up to 89.7% by overlapping reconfiguration
Enabling Reconfiguration-Communication Overlap for Collective Communication in Optical Networks

Changbo Wu +3
cs.DC 2025-10-22 reviewed

Spot GPUs raise LLM RL throughput 1.5-2x at 28-49% lower cost
RLBoost: Harvesting Preemptible Resources for Cost-Efficient Reinforcement Learning on LLMs

Yongji Wu +7
cs.CL 2025-10-21 reviewed

Sparse attention trains 512K-context LLMs at 6x speed
MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

Wenxuan Li +5
cs.DC 2025-10-21 reviewed

TokenCake trims multi-agent LLM latency over 47% with smart KV cache moves
TokenCake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications

Zhuohang Bian +4
cs.NI 2025-10-20 reviewed

New algorithm sustains node activity to cut broadcast latency
A New Broadcast Model for Several Network Topologies

Hongbo Lu +5
cs.DC 2025-10-17 reviewed

Statistical method quantifies probabilistic training time guarantees at scale
PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training

Alicia Golden +6
cs.DC 2025-10-15 reviewed

Decode rescheduling cuts LLM P99 TPOT by 75%
STAR: Decode-Phase Rescheduling for LLM Inference

Zhibin Wang +10
cs.DC 2025-10-13 reviewed

FlexPipe cuts reserved GPUs for LLM serving from 75% to 30% of peak
FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters

Yanying Lin +4
cs.LG 2025-10-09 reviewed

Layered prefill slashes MoE TTFT by 70% without stalls
From Tokens to Layers: Redefining Stall-Free Scheduling for MoE Serving with Layered Prefill

Gunjun Lee +4
cs.LG 2025-10-09 reviewed

Sketches cut communication waste in Byzantine DFL
SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening

Murtaza Rangwala +3
cs.DC 2025-10-08 reviewed

Framework measures real makespans from abstract graphs on CPU-GPU-FPGA hardware
Evaluating Rapid Makespan Predictions for Heterogeneous Systems with Programmable Logic

Martin Wilhelm +3
cs.DC 2025-10-07 reviewed

Async algorithm takes consistent snapshots with O(n) messages
Asynchronous Checkpoint for Eventually Consistent Databases

Raaghav Ravishankar +2
cs.LG 2025-10-07 reviewed

Fused models win on long-range atomistic properties
When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning

Arindam Chowdhury +1
cs.DC 2025-10-07 reviewed

Profiling uncovers patterns that speed up large MoE inference 6.6x
Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference

Zhongkai Yu +8
cs.AI 2025-10-05 reviewed

Speculative predictions cut agent latency up to 20 percent
Speculative Actions: A Lossless Framework for Faster Agentic Systems

Naimeng Ye +5
cs.DC 2025-10-05 reviewed

Canonical rounds block optimal Byzantine consensus
Why Canonical Rounds Fail for Optimal Byzantine Resilience

Hagit Attiya +2
cs.DC 2025-10-03 reviewed

GPU data-movement cuts lower both time and energy for large sparse solves
On the energy efficiency of sparse matrix computations on multi-GPU clusters

Massimo Bernaschi +3
cs.DC 2025-09-29 reviewed

GRACE-MoE speeds up distributed MoE inference up to 4.66x
GRACE-MoE: Grouping and Replication with Locality-Aware Routing for Efficient Distributed MoE Inference

Yu Han +6
cs.DC 2025-09-29 reviewed

Harp speeds heterogeneous GPU training by 1.3x-1.6x
HARP: Orchestrating Automated Parallel Training on Heterogeneous GPU Clusters

Antian Liang +8
cs.CR 2025-09-29 reviewed

Graph model spots silent smart-grid eavesdroppers at 98% accuracy
Federated Spatiotemporal Graph Learning for Passive Attack Detection in Smart Grids

Bochra Al Agha +1
cs.CR 2025-09-29 reviewed

Simulator detects TON contract race conditions missed by static checks
BugMagnifier: TON Transaction Simulator for Revealing Smart Contract Vulnerabilities

Yury Yanovich +8
cs.DC 2025-09-27 reviewed

132k FaaS workflow runs on AWS and Azure show scaling and cost patterns
Characterizing FaaS Workflows on Public Clouds: The Good, the Bad and the Ugly

Varad Kulkarni +5
cs.DC 2025-09-25 reviewed

Modular bricks cut multimodal AI energy by 42% on small batteries
Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices

Yilong Li +7
cs.DC 2025-09-25 reviewed

Elastic PP achieves 1.69x speedup for long-context LLM training
InfiniPipe: Elastic Pipeline Parallelism for Efficient Variable-Length Long-Context LLM Training

Shiju Wang +7
cs.LG 2025-09-24 reviewed

Frontier AI models average 0.34 Wh per query on real hardware
Energy Use of AI Inference, Efficiency Pathways, and Test-Time Scaling

Felipe Oviedo +7
cs.DC 2025-09-24 reviewed

Dynamic TP changes boost LLM inference throughput 1.75x-6.57x
Amoeba: Runtime Tensor Parallel Transformation for LLM Inference Services

Haoyu Chen +5
cs.LG 2025-09-22 reviewed

Metric picks key workers to steady swarm learning on uneven data
Multi-Worker Selection based Distributed Swarm Learning for Edge IoT with Non-i.i.d. Data

Zhuoyu Yao +5
q-bio.NC 2025-09-16 reviewed

Tuned whole-brain model gains alpha rhythms and complexity
Emergent complexity and rhythms in evoked and spontaneous dynamics of human whole-brain models after tuning through analysis tools

Gianluca Gaglioti +12
cs.CR 2025-09-13 reviewed

TON checklist built from 233 real audit findings
From Paradigm Shift to Audit Rift: Empirical Analysis and Validation of Security Audit Methodologies for Asynchronous Smart Contract Systems

Yury Yanovich +8
physics.comp-ph 2025-09-10 reviewed

Single code base runs radiation hydrodynamics on any hardware scale
HARD: A Performance Portable Radiation Hydrodynamics Code based on FleCSI Framework

Julien Loiseau +9
cs.DC 2025-09-09 reviewed

Dual-phase expert scheduling cuts MoE LLM latency up to 7.55x
DuoServe-MoE: Dual-Phase Expert Prefetch and Caching for LLM Inference QoS Assurance

Yuning Zhang +4
cs.LG 2025-09-03 reviewed

DP training gains 2.21x throughput with dynamic layer quantization
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling

Yubo Gao +3
cs.DC 2025-08-29 reviewed

Chameleon recovers training within 11% of normal speed after faults
Chameleon: Adaptive Fault Tolerance for Distributed Training via Real-time Policy Selection

Yuhang Zhou +14
quant-ph 2025-08-26 reviewed

Resource estimates find feasible setups for distributed quantum computers
Architecting Distributed Quantum Computers: Design Insights from Resource Estimation

Dmitry Filippov +2
cs.DC 2025-08-22 reviewed

Default collectives up to 5x slower than tuned choices
PICO: Performance Insights for Collective Operations

Saverio Pasqualoni +5
cs.DC 2025-08-21 reviewed

HFX raises LLM SLO attainment 4.44x with joint scheduling and scaling
HFX: Joint Design of Algorithms and Systems for Multi-SLO Serving and Fast Scaling

Zahra Yousefijamarani +14
cs.DC 2025-08-21 reviewed

CausalMesh keeps caches consistent during client migrations
CausalMesh: A Formally Verified Causally Consistent Distributed Cache with Support for Client Migration

Haoran Zhang +4
cs.DC 2025-08-21 reviewed

Engine cuts mixed-precision LLM latency by up to 61 percent
LMDeploy Accelerates Mixed-Precision LLM Inference with TurboMind

Li Zhang +8
cs.LG 2025-08-20 reviewed

Client KMeans filtering yields near-IID results in federated distillation
Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data

Ahmed Mujtaba +3
cs.CC 2025-08-19 reviewed

Production control alone realizes any polynomial analog dynamics
Analog computation with transcriptional networks

David Doty +2
cs.DC 2025-08-18 reviewed

Expert placement strategy cuts MoE edge latency up to 30%
Accelerating Edge Inference for Distributed MoE Models with Latency-Optimized Expert Placement

Tian Wu +7
cs.DC 2025-08-11 reviewed

OPEN predicts GPU performance at 98% accuracy with minimal profiling
Coordinated Power Management on Heterogeneous Systems

Zhong Zheng +4