archive

Every paper Pith has read.

1164 papers in cs.DC · page 17

cs.DC 2026-02-14 reviewed

Probe-first scheduler holds control overhead near constant in GPU clusters
Laminar: A Probe-First Scheduling Paradigm with Deterministic Runtime Survival

Zhengyan Chu
cs.DC 2026-02-14 reviewed

Benchmark scores LLM Azure SDK code without running it
ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness

Wenxing Zhu +9
cs.LG 2026-02-13 reviewed

Mismatched weights suppress higher ranks in federated LoRA
Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity

Fei Wu +3
cs.CR 2026-02-12 reviewed

HE operations run 334 times faster on PIM hardware
DRAMatic Speedup: Accelerating HE Operations on a Processing-in-Memory System

Niklas Klinger +4
cs.CR 2026-02-12 reviewed

Narrower overrides contain exploits as well as broad ones
Legitimate Overrides in Decentralized Protocols

Oghenekaro Elem +1
cs.DC 2026-02-12 reviewed

Adaptive model deployments speed LLM serving 1.5x on average
OServe: Accelerating LLM Serving via Spatial-Temporal Workload Orchestration

Youhe Jiang +4
cs.DC 2026-02-11 reviewed

StreamServe is a new system for running large language models that splits input…
StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving

Satyam Kumar +3
cs.SI 2026-02-11 reviewed

626 autonomous AI agents form emergent social networks on their own
Emergent Social Structures in Autonomous AI Agent Networks: A Metadata Analysis of 626 Agents on the Pilot Protocol

Teodor-Ioan Calin
cs.DC 2026-02-11 reviewed

The paper describes an integrated methodology combining hardware modeling
Interferences within a certifiable design methodology for high-performance multi-core platforms

Mohamed Amine Khelassi (LECA) +11
cs.DC 2026-02-11 reviewed

VTC introduces virtual tensors in DNN compilation to track data movement via index…
VTC: DNN Compilation with Virtual Tensors for Data Movement Elimination

Muyan Hu +9
cs.LG 2026-02-10 reviewed

M3 Ultra hits 22.7 FPS real-time diffusion img2img
Systematic Optimization of Real-Time Diffusion Model Inference on Apple M3 Ultra

Yoichi Ochiai
cs.DC 2026-02-10 reviewed

Benchmark standardizes speculative decoding across realistic loads
SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

Talor Abramovich +7
cs.DC 2026-02-10 reviewed

Para-B&B parallelizes MIP solving deterministically with load balancing
Para-B&B: Load-Balanced Deterministic Parallelization of Solving MIP

Jinyu Zhang +5
cs.DC 2026-02-10 reviewed

Video codecs cut remote KV cache TTFT by 3.5x for LLMs
Efficient Remote KV Cache Reuse with GPU-native Video Codec

Liang Mi +5
cs.LG 2026-02-10 reviewed

Three Rashomon sets formalize model multiplicity in federated learning
Rashomon Sets and Model Multiplicity in Federated Learning

Xenia Heilmann +2
cs.LG 2026-02-09 reviewed

WebGPU dispatch overhead is 24-36 μs on Vulkan
Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers

Jędrzej Maczan
cs.OS 2026-02-09 reviewed

Equilibria enforces CXL fairness and raises performance 52 percent
Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale

Kaiyang Zhao +9
cs.DC 2026-02-09 reviewed

Original papers outperform tutorials for system design mastery
The Computer System Trail

Sushant Kumar Gupta
cs.PL 2026-02-06 reviewed

Grassroots logic programs get correct deterministic multiagent form
Implementing Grassroots Logic Programs with Multiagent Transition Systems and AI (Full Version)

Ehud Shapiro
cs.CR 2026-02-06 reviewed

Wonderboom aggregates million Ethereum signatures in one slot
Wonderboom -- Efficient, and Censorship-Resilient Signature Aggregation for Million Scale Consensus

Zeta Avarikioti +3
math.OC 2026-02-05 reviewed

GPU kernels solve stochastic optimization for over a million scenarios
From Sequential to Parallel: Reformulating Dynamic Programming as GPU Kernels for Large-Scale Stochastic Combinatorial Optimization

Jingyi Zhao +4
cs.OS 2026-02-04 reviewed

Host RAM enables single-GPU training of 120B LLMs
Horizon-LM: A RAM-Centric Architecture for LLM Training

Zhengqing Yuan +2
cs.MA 2026-02-04 reviewed

Multi-agent system recovers better from smart contract audit failures
SPEAR: An Engineering Case Study of Multi-Agent Coordination for Smart Contract Auditing

Indraveni Chebolu +2
cs.LG 2026-02-04 reviewed

XaaS cuts edge AI explanation latency by 38 percent
Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems

Samaresh Kumar Singh +1
cs.DC 2026-01-30 reviewed

Epoch events resolve duelling admins in CRDT groups
ERA: Epoch-Resolved Arbitration for Duelling Admins in Group Management CRDTs

Kegan Dougal
cs.DC 2026-01-30 reviewed

Data centers offset carbon by supplying grid regulation
Coordinating GPU Data Centers and Power Grid Regulation Service for Exogenous Carbon Benefits

Ali Jahanshahi +4
cs.LG 2026-01-29 reviewed

Stored updates remove partial-participation bias from federated training
FedAdaVR: Adaptive Variance Reduction for Robust Federated Learning under Limited Client Participation

S M Ruhul Kabir Howlader +3
cs.AI 2026-01-29 reviewed

Centralized critic beats decentralized critics in LLM collaboration
Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic

Shuo Liu +3
cs.DC 2026-01-29 reviewed

Primary access hints speed Ethereum replay 25x
Ira: Efficient Transaction Replay for Distributed Systems

Adithya Bhat +2
cs.DC 2026-01-29 reviewed

ZipMoE cuts MoE latency up to 73% on edge devices
ZipMoE: Efficient On-Device MoE Serving via Lossless Compression and Cache-Affinity Scheduling

Yuchen Yang +4
cs.AR 2026-01-28 reviewed

First NPU designed for diffusion language model inference
NPU Design for Diffusion Language Model Inference

Binglei Lou +11
cs.DC 2026-01-28 reviewed

Chunk scheduling overlaps compute and comms inside one kernel
Syncopate: Efficient Multi-GPU AI Kernels via Automatic Chunk-Centric Compute-Communication Overlap

Xinwei Qiang +5
cs.DC 2026-01-28 reviewed

Rotary scheduler raises LLM TTFT SLO rates by 75% on superchips
SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips

Jiahuan Yu +3
cs.CR 2026-01-27 reviewed

eIDAS 2.0 can evolve to support Self-Sovereign Identity
Self-Sovereign Identity and eIDAS 2.0: An Analysis of Control, Privacy, and Legal Implications

Nacereddine Sitouah +2
cs.DC 2026-01-23 reviewed

Edge AI framework surpasses IPW=1.0 on quantized LLM
QEIL v2: Heterogeneous Computing for Edge Intelligence via Roofline-Derived Pareto-Optimal Energy Modeling and Multi-Objective Orchestration

Satyam Kumar +1
cs.DC 2026-01-22 reviewed

Space-filling curves simplify fast matrix multiplication
Space Filling Curves is All You Need: Communication-Avoiding Matrix Multiplication Made Simple

Evangelos Georganas +2
cs.LG 2026-01-21 reviewed

Fitness score ranks IoT subnets in 20 seconds
DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training

Bostan Khan +1
cs.DC 2026-01-20 reviewed

PyTorch library unifies differentiable sparse solvers across backends
torch-sla: Differentiable Sparse Linear Algebra with Adjoint Solvers and Sparse Tensor Parallelism for PyTorch

Mingyuan Chi +1
cs.DC 2026-01-19 reviewed

Co-design lets agentic LLMs handle 77% more load at same latency
Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference

Anish Biswas +7
cs.DC 2026-01-15 reviewed

Beta metric delivers 96.5% optimal edge AI performance
Mitigating GIL Bottlenecks in Edge AI Systems

Mridankan Mandal +1
cs.DC 2026-01-15 reviewed

WISP boosts distributed LLM capacity up to 4.1x
WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

Xiangchen Li +9
math.NA 2026-01-15 reviewed

Oblique projection preserves symmetry in pseudo-Hermitian eigensolves
Chebyshev Accelerated Subspace Eigensolver for Pseudo-hermitian Hamiltonians

Edoardo Di Napoli +4
math.OC 2026-01-12 reviewed

Multi-GPU framework scales PDHG to massive linear programs
D-PDLP: Scaling PDLP to Distributed Multi-GPU Systems

Hongpei Li +4
cs.DC 2026-01-09 reviewed

Style transfer and prompts boost federated domain generalization
Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization

Yuliang Chen +7
cs.DC 2026-01-09 reviewed

Three-layer memory system lifts distributed AI speed and efficiency
Self-Evolving Distributed Memory Architecture for Scalable AI Systems

Zixuan Li +2
cs.DC 2026-01-09 reviewed

Self-evolving memory architecture reaches 87% utilization in distributed AI
Self-Evolving Distributed Memory Architecture for Scalable AI Systems

Zixuan Li +2
cs.CR 2026-01-06 reviewed

Blockchains must share data across chains for complex uses
Exploring Blockchain Interoperability: Frameworks, Use Cases, and Future Challenges

Stanly Wilson +5
cs.NI 2026-01-05 reviewed

Oblivious routing cannot beat √(2k)/4 load on sparse tori
Optimal Oblivious Load-Balancing for Sparse Traffic in Large-Scale Satellite Networks

Rudrapatna Vallabh Ramakanth +1
cs.DC 2026-01-02 reviewed

GCP 23% faster on retail POS workloads but Azure 72% cheaper
Cost-Performance Analysis of Cloud-Based Retail Point-of-Sale Systems: A Comparative Study of Google Cloud Platform and Microsoft Azure

Ravi Teja Pagidoju
cs.CR 2026-01-01 reviewed

Consensus protocol secures multi-client data until unanimous agreement
Secure, Verifiable, and Scalable Multi-Client Data Sharing via Consensus-Based Privacy-Preserving Data Distribution

Prajwal Panth +1