archive

Every paper Pith has read.

1164 papers in cs.DC · page 19

  1. cs.DC 2025-11-04 reviewed
    Simpler recovery fixes leaderless Paxos and generalizes it optimally

    Making Democracy Work: Fixing and Simplifying Egalitarian Paxos (Extended Version)

    Fedor Ryabinin +2

  2. cs.LG 2025-11-03 reviewed
    Neuromorphic chip learns new classes with 113x lower latency

    Online Continual Learning on Intel Loihi 2 via a Co-designed Spiking Neural Network

    Elvin Hajizada +7

  3. cs.DC 2025-11-03 reviewed
    Stable errors bound local clock skew by O(Δ + δ log D)

    Gradient Clock Synchronization with Practically Constant Local Skew

    Christoph Lenzen

  4. cs.DC 2025-10-31 reviewed
    Uniform RDMA WriteImm interface reaches 400 Gbps on NVIDIA and AWS NICs

    fabric-lib: RDMA Point-to-Point Communication for LLM Systems

    Nandor Licker +3

  5. cs.DC 2025-10-31 reviewed
    kNN predicts good sub-system sizes for GPU tridiagonal partition

    ML-Based Optimum Sub-system Size Heuristic for the GPU Implementation of the Tridiagonal Partition Method

    Milena Veneva

  6. cs.AI 2025-10-31 reviewed
    AI matches human experts in designing LLM cluster algorithms

    Glia: A Human-Inspired AI for Automated Systems Design and Optimization

    Pouya Hamadanian +7

  7. cs.LG 2025-10-25 reviewed
    Dictator clients erase honest contributions in federated learning

    Power to the Clients: Federated Learning in a Dictatorship Setting

    Mohammadsajad Alipour +1

  8. cs.DC 2025-10-24 reviewed
    Quasipolylog rounds for (Δ+1)-coloring when neighborhood independence is bounded

    Distributed $(\Delta+1)$-Coloring in Graphs of Bounded Neighborhood Independence

    Marc Fuchs +1

  9. cs.NI 2025-10-22 reviewed
    SWOT cuts collective communication time up to 89.7% by overlapping reconfiguration

    Enabling Reconfiguration-Communication Overlap for Collective Communication in Optical Networks

    Changbo Wu +3

  10. cs.DC 2025-10-22 reviewed
    Spot GPUs raise LLM RL throughput 1.5-2x at 28-49% lower cost

    RLBoost: Harvesting Preemptible Resources for Cost-Efficient Reinforcement Learning on LLMs

    Yongji Wu +7

  11. cs.CL 2025-10-21 reviewed
    Sparse attention trains 512K-context LLMs at 6x speed

    MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

    Wenxuan Li +5

  12. cs.DC 2025-10-21 reviewed
    TokenCake trims multi-agent LLM latency over 47% with smart KV cache moves

    TokenCake: A KV-Cache-centric Serving Framework for LLM-based Multi-Agent Applications

    Zhuohang Bian +4

  13. cs.NI 2025-10-20 reviewed
    New algorithm sustains node activity to cut broadcast latency

    A New Broadcast Model for Several Network Topologies

    Hongbo Lu +5

  14. cs.DC 2025-10-17 reviewed
    Statistical method quantifies probabilistic training time guarantees at scale

    PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training

    Alicia Golden +6

  15. cs.DC 2025-10-15 reviewed
    Decode rescheduling cuts LLM P99 TPOT by 75%

    STAR: Decode-Phase Rescheduling for LLM Inference

    Zhibin Wang +10

  16. cs.DC 2025-10-13 reviewed
    FlexPipe cuts reserved GPUs for LLM serving from 75% to 30% of peak

    FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters

    Yanying Lin +4

  17. cs.LG 2025-10-09 reviewed
    Layered prefill slashes MoE TTFT by 70% without stalls

    From Tokens to Layers: Redefining Stall-Free Scheduling for MoE Serving with Layered Prefill

    Gunjun Lee +4

  18. cs.LG 2025-10-09 reviewed
    Sketches cut communication waste in Byzantine DFL

    SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening

    Murtaza Rangwala +3

  19. cs.DC 2025-10-08 reviewed
    Framework measures real makespans from abstract graphs on CPU-GPU-FPGA hardware

    Evaluating Rapid Makespan Predictions for Heterogeneous Systems with Programmable Logic

    Martin Wilhelm +3

  20. cs.DC 2025-10-07 reviewed
    Async algorithm takes consistent snapshots with O(n) messages

    Asynchronous Checkpoint for Eventually Consistent Databases

    Raaghav Ravishankar +2

  21. cs.LG 2025-10-07 reviewed
    Fused models win on long-range atomistic properties

    When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning

    Arindam Chowdhury +1

  22. cs.DC 2025-10-07 reviewed
    Profiling uncovers patterns that speed up large MoE inference 6.6x

    Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference

    Zhongkai Yu +8

  23. cs.AI 2025-10-05 reviewed
    Speculative predictions cut agent latency up to 20 percent

    Speculative Actions: A Lossless Framework for Faster Agentic Systems

    Naimeng Ye +5

  24. cs.DC 2025-10-05 reviewed
    Canonical rounds block optimal Byzantine consensus

    Why Canonical Rounds Fail for Optimal Byzantine Resilience

    Hagit Attiya +2

  25. cs.DC 2025-10-03 reviewed
    GPU data-movement cuts lower both time and energy for large sparse solves

    On the energy efficiency of sparse matrix computations on multi-GPU clusters

    Massimo Bernaschi +3

  26. cs.DC 2025-09-29 reviewed
    GRACE-MoE speeds up distributed MoE inference up to 4.66x

    GRACE-MoE: Grouping and Replication with Locality-Aware Routing for Efficient Distributed MoE Inference

    Yu Han +6

  27. cs.DC 2025-09-29 reviewed
    HARP speeds heterogeneous GPU training by 1.3x-1.6x

    HARP: Orchestrating Automated Parallel Training on Heterogeneous GPU Clusters

    Antian Liang +8

  28. cs.CR 2025-09-29 reviewed
    Graph model spots silent smart-grid eavesdroppers at 98% accuracy

    Federated Spatiotemporal Graph Learning for Passive Attack Detection in Smart Grids

    Bochra Al Agha +1

  29. cs.CR 2025-09-29 reviewed
    Simulator detects TON contract race conditions missed by static checks

    BugMagnifier: TON Transaction Simulator for Revealing Smart Contract Vulnerabilities

    Yury Yanovich +8

  30. cs.DC 2025-09-27 reviewed
    132k FaaS workflow runs on AWS and Azure show scaling and cost patterns

    Characterizing FaaS Workflows on Public Clouds: The Good, the Bad and the Ugly

    Varad Kulkarni +5

  31. cs.DC 2025-09-25 reviewed
    Modular bricks cut multimodal AI energy by 42% on small batteries

    Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices

    Yilong Li +7

  32. cs.DC 2025-09-25 reviewed
    Elastic PP achieves 1.69x speedup for long-context LLM training

    InfiniPipe: Elastic Pipeline Parallelism for Efficient Variable-Length Long-Context LLM Training

    Shiju Wang +7

  33. cs.LG 2025-09-24 reviewed
    Frontier AI models average 0.34 Wh per query on real hardware

    Energy Use of AI Inference, Efficiency Pathways, and Test-Time Scaling

    Felipe Oviedo +7

  34. cs.DC 2025-09-24 reviewed
    Dynamic TP changes boost LLM inference throughput 1.75x-6.57x

    Amoeba: Runtime Tensor Parallel Transformation for LLM Inference Services

    Haoyu Chen +5

  35. cs.LG 2025-09-22 reviewed
    Metric picks key workers to steady swarm learning on uneven data

    Multi-Worker Selection based Distributed Swarm Learning for Edge IoT with Non-i.i.d. Data

    Zhuoyu Yao +5

  36. q-bio.NC 2025-09-16 reviewed
    Tuned whole-brain model gains alpha rhythms and complexity

    Emergent complexity and rhythms in evoked and spontaneous dynamics of human whole-brain models after tuning through analysis tools

    Gianluca Gaglioti +12

  37. cs.CR 2025-09-13 reviewed
    TON checklist built from 233 real audit findings

    From Paradigm Shift to Audit Rift: Empirical Analysis and Validation of Security Audit Methodologies for Asynchronous Smart Contract Systems

    Yury Yanovich +8

  38. physics.comp-ph 2025-09-10 reviewed
    Single code base runs radiation hydrodynamics on any hardware scale

    HARD: A Performance Portable Radiation Hydrodynamics Code based on FleCSI Framework

    Julien Loiseau +9

  39. cs.DC 2025-09-09 reviewed
    Dual-phase expert scheduling cuts MoE LLM latency up to 7.55x

    DuoServe-MoE: Dual-Phase Expert Prefetch and Caching for LLM Inference QoS Assurance

    Yuning Zhang +4

  40. cs.LG 2025-09-03 reviewed
    DP training gains 2.21x throughput with dynamic layer quantization

    DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling

    Yubo Gao +3

  41. cs.DC 2025-08-29 reviewed
    Chameleon recovers training within 11% of normal speed after faults

    Chameleon: Adaptive Fault Tolerance for Distributed Training via Real-time Policy Selection

    Yuhang Zhou +14

  42. quant-ph 2025-08-26 reviewed
    Resource estimates find feasible setups for distributed quantum computers

    Architecting Distributed Quantum Computers: Design Insights from Resource Estimation

    Dmitry Filippov +2

  43. cs.DC 2025-08-22 reviewed
    Default collectives up to 5x slower than tuned choices

    PICO: Performance Insights for Collective Operations

    Saverio Pasqualoni +5

  44. cs.DC 2025-08-21 reviewed
    HFX raises LLM SLO attainment 4.44x with joint scheduling and scaling

    HFX: Joint Design of Algorithms and Systems for Multi-SLO Serving and Fast Scaling

    Zahra Yousefijamarani +14

  45. cs.DC 2025-08-21 reviewed
    CausalMesh keeps caches consistent during client migrations

    CausalMesh: A Formally Verified Causally Consistent Distributed Cache with Support for Client Migration

    Haoran Zhang +4

  46. cs.DC 2025-08-21 reviewed
    Engine cuts mixed-precision LLM latency by up to 61 percent

    LMDeploy Accelerates Mixed-Precision LLM Inference with TurboMind

    Li Zhang +8

  47. cs.LG 2025-08-20 reviewed
    Client KMeans filtering yields near-IID results in federated distillation

    Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data

    Ahmed Mujtaba +3

  48. cs.CC 2025-08-19 reviewed
    Production control alone realizes any polynomial analog dynamics

    Analog computation with transcriptional networks

    David Doty +2

  49. cs.DC 2025-08-18 reviewed
    Expert placement strategy cuts MoE edge latency up to 30%

    Accelerating Edge Inference for Distributed MoE Models with Latency-Optimized Expert Placement

    Tian Wu +7

  50. cs.DC 2025-08-11 reviewed
    OPEN predicts GPU performance at 98% accuracy with minimal profiling

    Coordinated Power Management on Heterogeneous Systems

    Zhong Zheng +4