pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1164 papers in cs.DC · page 13

  1. cs.AR 2026-04-13 reviewed
    Decoupled matrix units deliver up to 2.31x AI speedups on CPUs

    CUTEv2: Unified and Configurable Matrix Extension for Diverse CPU Architectures with Minimal Design Overhead

    Jinpeng Ye +13

  2. cs.DC 2026-04-13 reviewed
    Self-calibrating digital twin reaches 4.39% MAPE on datacenter predictions

    OpenDT: Exploring Datacenter Performance and Sustainability with a Self-Calibrating Digital Twin

    Radu Nicolae +4

  3. cs.DC 2026-04-13 reviewed
    HPC fabrics show distinct congestion under AI-like bursts

    Characterizing the Impact of Congestion in Modern HPC Interconnects

    Lorenzo Piarulli +9

  4. cs.LG 2026-04-13 reviewed
    Pipeline compresses federated models over 11 times for 60% faster training

    A Full Compression Pipeline for Green Federated Learning in Communication-Constrained Environments

    Elouan Colybes +2

  5. cs.DC 2026-04-13 reviewed
    Hierarchical search tunes GPU apps better and faster

    Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search

    Daniel Nichols +5

  6. cs.DC 2026-04-13 reviewed
    Proactive DQN scaling outperforms reactive Kubernetes autoscalers

    NimbusGuard: A Novel Framework for Proactive Kubernetes Autoscaling Using Deep Q-Networks

    Chamath Wanigasooriya +1

  7. quant-ph 2026-04-13 reviewed
    Scheduler runs multiple quantum jobs in parallel on linked QPUs

    QuMod: Parallel Quantum Job Scheduling on Modular QPUs using Circuit Cutting

    Vinooth Kulkarni +5

  8. cs.NI 2026-04-13 reviewed
    Different GPU splits across LLMs change quality by 87% at fixed latency

    RouterWise: Joint Resource Allocation and Routing for Latency-Aware Multi-Model LLM Serving

    Hossein Hosseini Kasnavieh +2

  9. cs.DC 2026-04-12 reviewed
    Hybrid backend speeds cross-silo FL up to 3.8x for large models

    Understanding Communication Backends in Cross-Silo Federated Learning

    Amir Ziashahabi +2

  10. eess.SY 2026-04-12 reviewed
    AI workload mix smooths power variability but keeps fast ramps

    Workload composition smooths aggregate power demand while sustaining short-horizon ramps in AI data centers

    Subir Majumder +2

  11. cs.DC 2026-04-12 reviewed
    Thinning to degree two extends data center stability region

    Bipartite matching under communication constraints

    Moonmoon Mohanty +5

  12. cs.CR 2026-04-12 reviewed
    Protocol hides verifier claim choices from holders

    COD-ssi: Enforcing Mutual Privacy for Credential Oblivious Disclosure in Self Sovereign Identity

    Elia Onofri +4

  13. cs.DC 2026-04-12 reviewed
    Stackelberg game optimizes incentives and privacy noise in federated learning

    FEDBUD: Joint Incentive and Privacy Optimization for Resource-Constrained Federated Learning

    Tao Liu +1

  14. cs.DC 2026-04-12 reviewed
    One CIR image deploys on any platform after lazy build

    CIR: Lightweight Container Image for Cross-Platform Deployment

    Fengzhi Li +8

  15. cs.DC 2026-04-12 reviewed
    LLMs derive exact GPU thread maps that cut energy use up to 4833x

    Leveraging Mathematical Reasoning of LLMs for Efficient GPU Thread Mapping

    Jose Maureira +3

  16. cs.DC 2026-04-11 reviewed
    Icicle indexes billion-file HPC systems in real time

    Icicle: Scalable Metadata Indexing and Real-Time Monitoring for HPC File Systems

    Haochen Pan +9

  17. cs.DC 2026-04-11 reviewed
    INCGuard verifies in-network computing for packet-loss risks

    Verifying In-Network Computing Systems for Design Risks

    Tianyu Bai +3

  18. cs.DC 2026-04-11 reviewed
    Deep unrolling turns SP routines into reusable RF sensing blocks

    RF-LEGO: Modularized Signal Processing-Deep Learning Co-Design for RF Sensing via Deep Unrolling

    Luca Jiang-Tao Yu +1

  19. cs.AR 2026-04-11 reviewed
    Sparse measurements predict latency at every CPU-GPU frequency

    Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge

    Jiesong Chen +3

  20. cs.DC 2026-04-11 reviewed
    Kernel disaggregation lifts heterogeneous GPU throughput by 2.3x

    Tessera: Unlocking Heterogeneous GPUs through Kernel-Granularity Disaggregation

    Tiancheng Hu +12

  21. cs.DC 2026-04-11 reviewed
    FlexVector speeds GCN inference 3.78x with flexible registers

    FlexVector: A SpMM Vector Processor with Flexible VRF for GCNs on Varying-Sparsity Graphs

    Bohan Li +5

  22. cs.LG 2026-04-11 reviewed
    Local adaptive steps multiply comms savings in decentralized training

    LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication

    Wei Liu +7

  23. cs.DC 2026-04-11 reviewed
    Microkernel validation eliminates harm from agent restarts

    Rebooting Microreboot: Architectural Support for Safe, Parallel Recovery in Microservice Systems

    Laurent Bindschaedler

  24. cs.DC 2026-04-10 reviewed
    System choices scale HPL to 1.01 EF/s FP64 with 11.5x mixed precision gain

    Sustaining Exascale Performance: Lessons from HPL and HPL-MxP on Aurora

    Kazushige Goto +5

  25. cs.CR 2026-04-10 reviewed
    Lone attackers poison federated learning models

    XFED: Non-Collusive Model Poisoning Attack Against Byzantine-Robust Federated Classifiers

    Israt Jahan Mouri +2

  26. cs.LG 2026-04-10 reviewed
    NOMAD speeds up massive graph embeddings by 10-100x on CPU clusters

    NOMAD: Generating Embeddings for Massive Distributed Graphs

    Aishwarya Sarkar +3

  27. cs.DC 2026-04-10 reviewed
    Adaptive layer resolves LLM scaling paradox on NPUs

    A-IO: Adaptive Inference Orchestration for Memory-Bound NPUs

    Chen Zhang +5

  28. cs.DC 2026-04-10 reviewed
    MATCHA cuts DNN inference latency up to 35% on heterogeneous edge SoCs

    MATCHA: Efficient Deployment of Deep Neural Networks on Multi-Accelerator Heterogeneous Edge SoCs

    Enrico Russo +8

  29. cs.DC 2026-04-10 reviewed
    Reference storage cuts LLM RL rollout stalls up to 19x

    TensorHub: Scalable and Elastic Weight Transfer for LLM RL Training

    Chenhao Ye +13

  30. cs.OS 2026-04-10 reviewed
    Adaptive quantization cuts mobile LLM cold starts by 4x

    EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices

    Yongsheng Yan +3

  31. cs.DC 2026-04-10 reviewed
    Right GPU cuts LLM energy use by 70% in servers

    Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures

    Mauricio Fadel Argerich +2

  32. cs.DC 2026-04-10 reviewed
    DAG consensus protocol lifts CFT throughput in wide-area nets

    Finding Nemo-Nemo: CFT DAG-based Consensus in the WAN

    Rithwik Kerur +5

  33. cs.DC 2026-04-09 reviewed
    Method scales sensor optimization to billion-DOF tsunami models on GPUs

    Sensor Placement for Tsunami Early Warning via Large-Scale Bayesian Optimal Experimental Design

    Sreeram Venkat +2

  34. cs.DC 2026-04-09 reviewed
    CPU offload over NVLink-C2C fixes rigid GPU slice mismatches

    Taming GPU Underutilization via Static Partitioning and Fine-grained CPU Offloading

    Gabin Schieffer +3

  35. cs.DC 2026-04-09 reviewed
    Neural bandits learn better Kubernetes control-plane placements

    NL-CPS: Reinforcement Learning-Based Kubernetes Control Plane Placement in Multi-Region Clusters

    Sajid Alam +2

  36. cs.DC 2026-04-09 reviewed
    GPU HyperBall scales visibility graphs to 236k cells in 137 seconds

    City-Scale Visibility Graph Analysis via GPU-Accelerated HyperBall

    Alex Hodge +1

  37. cs.DC 2026-04-09 reviewed
    Causality arguments hold for quantum distributed snapshots

    Asynchronous Quantum Distributed Computing: Causality, Snapshots, and Global Operations

    Siddhartha Visveswara Jayanti +1

  38. cs.DC 2026-04-09 reviewed
    Joint algorithm minimizes weighted coflow time across OCS cores

    Scheduling Coflows in Multi-Core OCS Networks with Performance Guarantee

    Xin Wang +3

  39. cs.DC 2026-04-09 reviewed
    Speculative trees grow only when they cut inference time

    SMART: When is it Actually Worth Expanding a Speculative Tree?

    Lifu Wang +1

  40. cs.DC 2026-04-09 reviewed
    Energy-efficient GPUs deliver better value under budget limits

    Wattlytics: A Web Platform for Co-Optimizing Performance, Energy, and TCO in HPC Clusters

    Ayesha Afzal +2

  41. cs.DC 2026-04-09 reviewed
    Decomposed diffusion workflows handle 3x more requests

    LegoDiffusion: Micro-Serving Text-to-Image Diffusion Workflows

    Lingyun Yang +12

  42. cs.DC 2026-04-09 reviewed
    Shared log makes LLM agent actions visible and stoppable

    LogAct: Enabling Agentic Reliability via Shared Logs

    Mahesh Balakrishnan +9

  43. cs.DC 2026-04-09 reviewed
    Beam speculation yields 1.4x LLM agent speedup on edge

    B-PASTE: Beam-Aware Pattern-Guided Speculative Execution for Resource-Constrained LLM Agents

    Yanfei Song

  44. cs.DC 2026-04-09 reviewed
    Decentralized edge agents lift mobile task success 21.7%

    Administrative Decentralization in Edge-Cloud Multi-Agent for Mobile Automation

    Senyao Li +5

  45. cs.DC 2026-04-09 reviewed
    Integrated panels give orbital AI 100 kW per ton

    Reduced-Mass Orbital AI Inference via Integrated Solar, Compute, and Radiator Panels

    Stephen Gaalema +2

  46. cs.DC 2026-04-08 reviewed
    No single config optimizes all goals in edge speculative LLM

    ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge-Cloud Speculative LLM Serving

    Xiangchen Li +5

  47. cs.DC 2026-04-08 reviewed
    CPU-free LLM serving cuts P99 latency up to 8x

    Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC

    Mohammad Siavashi +4

  48. cs.CR 2026-04-08 reviewed
    Bonded identities and delay randomness fix MEV ordering

    MEV-ACE: Identity-Authenticated Fair Ordering for Proposer-Controlled MEV Mitigation

    Jian Sheng Wang

  49. cs.DS 2026-04-08 reviewed
    Batch algorithm updates maximal independent set in O(b log^3 n) work

    Parallel Batch-Dynamic Maximal Independent Set

    Guy Blelloch +4

  50. eess.SY 2026-04-08 reviewed
    AI workload power data scales to full data center energy profiles

    Measurement of Generative AI Workload Power Profiles for Whole-Facility Data Center Infrastructure Planning

    Roberto Vercellino +9