archive

Every paper Pith has read. Search by title, abstract, or pith.

1164 papers in cs.DC · page 12

  1. cs.DC 2026-04-16 reviewed
    Invariants let agents match hand-optimized GPU kernels

    ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants

    Haohui Mai +9

  2. cs.AR 2026-04-16 reviewed
    SCENIC hits 200G SmartNIC speed with programmable stream units

    SCENIC: Stream Computation-Enhanced SmartNIC

    Benjamin Ramhorst +6

  3. cs.DC 2026-04-16 reviewed
    Hybrid models let prefill run in a separate datacenter

    Prefill-as-a-Service: KVCache of Next-Generation Models Could Go Cross-Datacenter

    Ruoyu Qin +7

  4. cs.DC 2026-04-16 reviewed
    Closed forms give exact multi-NUMA VM counts per host

    Efficient calculation of available space for multi-NUMA virtual machines

    Andrei Gudkov +2

  5. cs.DC 2026-04-16 reviewed
    Block placement and cache rules cut LLM serving latency

    Serving Chain-structured Jobs with Large Memory Footprints with Application to Large Foundation Model Serving

    Tingyang Sun +2

  6. cs.AI 2026-04-16 reviewed
    Game equilibria set synthetic data volumes in coopetitive learning

    Cooperate to Compete: Strategic Data Generation and Incentivization Framework for Coopetitive Cross-Silo Federated Learning

    Thanh Linh Nguyen +2

  7. cs.IT 2026-04-16 reviewed
    FL compression gains depend on correlation strength

    Exploiting Correlations in Federated Learning: Opportunities and Practical Limitations

    Adrian Edin +3

  8. cs.LG 2026-04-16 reviewed
    MoE serving gains 6.6x speedup via elastic self-speculation on 3D stacks

    ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving

    Yuseon Choi +7

  9. cs.DC 2026-04-16 reviewed
    Direct propagation matches locality lower bound in distributed DP

    Locality, Not Spectral Mixing, Governs Direct Propagation in Distributed Offline Dynamic Programming

    Ibne Farabi Shihab

  10. cs.DC 2026-04-16 reviewed
    Forkable shared logs let AI agents branch streaming data

    AgileLog: A Forkable Shared Log for Agents on Data Streams

    Shreesha G. Bhat +4

  11. cs.DC 2026-04-16 reviewed
    CoCoDiff speeds up distributed DiT inference 3.6x on average

    CoCoDiff: Optimizing Collective Communications for Distributed Diffusion Transformer Inference Under Ulysses Sequence Parallelism

    Bin Ma +4

  12. cs.DS 2026-04-16 reviewed
    Registers achieve log P latency despite contention

    Fast Concurrent Primitives Despite Contention

    Michael A. Bender +6

  13. cs.DB 2026-04-15 reviewed
    PIM hardware speeds R-tree queries up to 3.66x with less energy

    Parallel R-tree-based Spatial Query Processing on a Commercial Processing-in-Memory System

    Tasmia Jannat +2

  14. quant-ph 2026-04-15 reviewed
    VQLS cuts circuit count 256x for 10-qubit systems

    Distributed Variational Quantum Linear Solver

    Chao Lu +3

  15. cs.DC 2026-04-15 reviewed
    GPU hypergraph partitioner reaches 940x speedup with improved quality

    Incidence Constraints in Hypergraph Partitioning on GPU

    Marco Ronzani +1

  16. cs.CR 2026-04-15 reviewed
    Five themes together build cyber-physical resilience

    Digital Guardians: The Past and The Future of Cyber-Physical Resilience

    Saurabh Bagchi +22

  17. cs.CR 2026-04-15 reviewed
    Finite withholding beats infinite withholding by unbounded factor in pools

    Temporary Power Adjusting Withholding Attack

    Mustafa Doger +1

  18. cs.CR 2026-04-15 reviewed
    Temporary withholding boosts pool attack rewards 22x over permanent version

    Temporary Power Adjusting Withholding Attack

    Mustafa Doger +1

  19. cs.DC 2026-04-15 reviewed
    Inference tasks replace mining in AI blockchain consensus

    HadAgent: Harness-Aware Decentralized Agentic AI Serving with Proof-of-Inference Blockchain Consensus

    Landy Jimenez +5

  20. cs.DC 2026-04-15 reviewed
    OffloadFS moves database compaction to storage nodes for 3.36x speedup

    OffloadFS: Leveraging Disaggregated Storage for Computation Offloading

    Sungho Moon +6

  21. cs.CR 2026-04-15 reviewed
    Encrypted face data counts crowds without naming anyone

    Head Count: Privacy-Preserving Face-Based Crowd Monitoring

    Fatemeh Marzani +3

  22. cs.DC 2026-04-15 reviewed
    Open Ethernet HPC cluster ranks 49th on TOP500

    SAKURAONE: An Open Ethernet-Based AI HPC System and Its Observed Workload Dynamics in a Single-Tenant LLM Development Environment

    Fumikazu Konishi +2

  23. cs.RO 2026-04-15 reviewed
    Adaptive edge system raises robotics AI service quality

    Self-adaptive Multi-Access Edge Architectures: A Robotics Case

    Mahyar T Moghaddam +2

  24. cs.CR 2026-04-15 reviewed
    Distributed servers with MPC cut costs for private vertical federated learning

    Secure and Privacy-Preserving Vertical Federated Learning

    Shan Jin +4

  25. cs.DC 2026-04-15 reviewed
    PackSELL packs deltas and values to speed GPU SpMV 1.63x in FP16

    PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV

    Kengo Suzuki +1

  26. cs.DC 2026-04-14 reviewed
    Event Tensor abstraction compiles dynamic megakernels

    Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel

    Hongyi Jin +20

  27. cs.DC 2026-04-14 reviewed
    DySkew cuts UDF skew delays with runtime data swaps

    DySkew: Dynamic Data Redistribution for Skew-Resilient Snowpark UDF Execution

    Chenwei Xie +10

  28. cs.DC 2026-04-14 reviewed
    Academia trains 70B open LLM on Alps supercomputer

    An Engineering Journey Training Large Language Models at Scale on Alps: The Apertus Experience

    Jonathan Coles +22

  29. cs.PL 2026-04-14 reviewed
    Virtual machine speeds array programs 147x on GPUs

    Towards a Linear-Algebraic Hypervisor

    Breandan Considine

  30. cs.AR 2026-04-14 reviewed
    EPAC RISC-V chip with three tiles taped out in 22nm

    EPAC: The Last Dance

    Filippo Mantovani +38

  31. cs.DC 2026-04-14 reviewed
    ML ensemble cuts CI memory waste by 36 GB per build

    Intelligent resource prediction for SAP HANA continuous integration build workloads

    Torsten Mandel +3

  32. cs.DC 2026-04-14 reviewed
    Hybrid platform extends supercomputers to full AI model lifecycle

    Beyond Pre-Training: The Full Lifecycle of Foundation Models on HPC Systems

    Dino Conciatore +6

  33. cs.DC 2026-04-14 reviewed
    The paper proposes pAirZero, a framework combining zeroth-order optimization and…

    Three Birds, One Stone: Solving the Communication-Memory-Privacy Trilemma in LLM Fine-tuning Over Wireless Networks with Zeroth-Order Optimization

    Zhijie Cai +5

  34. cs.DC 2026-04-14 reviewed
    Local routing plus compression cuts cloud LLM tokens 45-79%

    Local-Splitter: A Measurement Study of Seven Tactics for Reducing Cloud LLM Token Usage on Coding-Agent Workloads

    Justice Owusu Agyemang +4

  35. cs.OS 2026-04-14 reviewed
    MARS cuts agentic latency by 5.94x via co-scheduling

    MARS: Efficient, Adaptive Co-Scheduling for Heterogeneous Agentic Systems

    Yifei Wang +10

  36. cs.AR 2026-04-14 reviewed
    Compiler cuts NPU transformer energy use by up to 41%

    Forge-UGC: FX optimization and register-graph engine for universal graph compiler

    Satyam Kumar +1

  37. cs.LG 2026-04-14 reviewed
    Lévy jumps fix trapping in decentralized random walks

    Decentralized Learning via Random Walk with Jumps

    Zonghong Liu +2

  38. cs.DC 2026-04-14 reviewed
    Periodic framework organizes distributed computing

    A Periodic Space of Distributed Computing: Vision & Framework

    Mohsen Amini Salehi +7

  39. cs.LG 2026-04-14 reviewed
    Physics-informed DLinear forecasts AI data center power more accurately

    A Physics-Aware Framework for Short-Term GPU Power Forecasting of AI Data Centers

    Mohammad AlShaikh Saleh +4

  40. cs.DC 2026-04-14 reviewed
    BlazingAML matches AML accuracy at 210x CPU speed

    BlazingAML: High-Throughput Anti-Money Laundering (AML) via Multi-Stage Graph Mining

    Haojie Ye +4

  41. cs.DC 2026-04-14 reviewed
    Live pipeline changes cut LLM first-token time by 2.5x

    PipeLive: Efficient Live In-place Pipeline Parallelism Reconfiguration for Dynamic LLM Serving

    Xu Bai +3

  42. cs.AI 2026-04-13 reviewed
    Reference-based replication creates AI agents in constant time

    Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents

    Swanand Rao +3

  43. cs.DC 2026-04-13 reviewed
    StableHLO unifies ML performance modeling across GPUs and TPUs

    Evaluating Cross-Architecture Performance Modeling of Distributed ML Workloads Using StableHLO

    Jonas Svedas +8

  44. cs.DC 2026-04-13 reviewed
    Pipelined Parareal on GPUs speeds microswimmer simulations

    Accelerating Microswimmer Simulations via a Heterogeneous Pipelined Parallel-in-Time Framework

    Ruixiang Huang +1

  45. cs.DC 2026-04-13 reviewed
    Bayesian Noisy-OR model cuts failure detection time by 60%

    Predictive Bayesian Arbitration: A Scalable Noisy-OR Model with Service Criticality Awareness

    Anil Jangam +2

  46. cs.SE 2026-04-13 reviewed
    Remote Git service delivers monorepo checkouts in under a second

    GitFarm: Git as a Service for Large-Scale Monorepos

    Preetam Dwivedi +2

  47. cs.DC 2026-04-13 reviewed
    Visual analytics clusters HPC nodes to expose behavioral differences

    Understanding Large-Scale HPC System Behavior Through Cluster-Based Visual Analytics

    Allison Austin +6

  48. cs.LG 2026-04-13 reviewed
    Residual bottlenecks deliver 128x activation compression for pipelines

    ResBM: Residual Bottleneck Models for Low-Bandwidth Pipeline Parallelism

    Alan Aboudib +3

  49. cs.OS 2026-04-13 reviewed
    Nanvix cuts serverless server needs by 20-100x

    Nanvix: A Multikernel OS Design for High-Density Serverless Deployments

    Carlos Segarra +6

  50. cs.CR 2026-04-13 reviewed
    Sparse FHE matmul on GPUs runs up to 3x faster than CPU

    GPU Acceleration of Sparse Fully Homomorphic Encrypted DNNs

    Lara D'Agata +9