pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1164 papers in cs.DC · page 14

  1. cs.DC 2026-04-08 reviewed
    GROMACS runs deep-potential MD at scale on multi-GPU systems

    Making Room for AI: Multi-GPU Molecular Dynamics with Deep Potentials in GROMACS

    Luca Pennati +4

  2. cs.DC 2026-04-08 reviewed
    Disaggregating LoRA triples request rate under latency limits

    InfiniLoRA: Disaggregated Multi-LoRA Serving for Large Language Models

    Hongyu Chen +8

  3. cs.DC 2026-04-08 reviewed
    LLM serving policies rewrite themselves online for 34% gains

    Autopoiesis: A Self-Evolving System Paradigm for LLM Serving Under Runtime Dynamics

    Youhe Jiang +6

  4. cs.DC 2026-04-08 reviewed
    One LLM call compiles web tasks into JSON that runs forever at fixed low cost

    Agentic Compilation: Mitigating the LLM Rerun Crisis for Minimized-Inference-Cost Web Automation

    Jagadeesh Chundru

  5. cs.DC 2026-04-08 reviewed
    Client scheduler hits 100% LLM deadlines at 4.2 requests per second

    Scheduling the Unschedulable: Taming Black-Box LLM Inference at Scale

    Renzhong Yuan +5

  6. cs.DC 2026-04-08 reviewed
    Nested pipelining gives 3x faster training on 1,500+ accelerators

    NestPipe: Large-Scale Recommendation Training on 1,500+ Accelerators via Nested Pipelining

    Zhida Jiang +14

  7. cs.DC 2026-04-08 reviewed
    Output-set tasks solvable under crashes iff inclusion graph connects

    On the Decidability of Distributed Tasks with Output Sets under Asynchrony and Any Number of Crashes

    Timothé Albouy +4

  8. cs.PL 2026-04-08 reviewed
    Priorities and clocks extend CCS to define coherence

    Determinacy with Priorities up to Clocks

    Luigi Liquori (Centre Inria de l'Université Côte d'Azur) +2

  9. cs.DC 2026-04-08 reviewed
    Multi-robot service prototype runs on Aggregate Programming

    Exploiting Aggregate Programming in a Multi-Robot Service Prototype

    Giorgio Audrito (Dipartimento di Informatica +6

  10. cs.PL 2026-04-08 reviewed
    Effpi adds branching for external choice and timeouts

    Branching Out: Existential External Choice in Effpi

    Benjamin Robinson (University of Oxford) +1

  11. cs.DC 2026-04-08 reviewed
    Layer-by-layer freezing fits private LLM tuning on edge devices

    Beyond End-to-End: Dynamic Chain Optimization for Private LLM Adaptation on the Edge

    Yebo Wu +5

  12. cs.DC 2026-04-08 reviewed
    Nexus cuts serverless CPU use 44% by offloading I/O from VMs

    Nexus: Transparent I/O Offloading for High-Density Serverless Computing

    JooYoung Park +6

  13. cs.AR 2026-04-08 reviewed
    SwarmIO emulates 40M IOPS SSDs for GPUs with 300x speedup

    SwarmIO: Towards 100 Million IOPS SSD Emulation for Next-generation GPU-centric Storage Systems

    Hyeseong Kim +2

  14. cs.DC 2026-04-08 reviewed
    Foundry cuts LLM cold-start time from minutes to seconds

    Foundry: Template-Based CUDA Graph Context Materialization for Fast LLM Serving Cold Start

    Xueshen Liu +5

  15. cs.DC 2026-04-08 reviewed
    SpMM requires structure-specific roofline models for accurate bounds

    Sparsity-Aware Roofline Models for Sparse Matrix-Matrix Multiplication

    Matthew Qian +3

  16. cs.DC 2026-04-08 reviewed
    DynLP updates graph labels 13x faster on average by limiting propagation to changed sub-

    DynLP: Parallel Dynamic Batch Update for Label Propagation in Semi-Supervised Learning

    S M Shovan +4

  17. cs.DC 2026-04-08 reviewed
    Canceled spot requests yield availability signals at near-zero cost

    Ding-Dong Ditch: Peeking Into Spot Instance Availability

    Kyumin Kim +3

  18. cs.DC 2026-04-08 reviewed
    Adaptive sync raises IoT ledger recovery after partitions

    Contextual Chain: Single-State Ledger Design for Mobile/IoT Networks with Frequent Partitions

    Song-Ju Kim

  19. cs.DC 2026-04-07 reviewed
    Copy-on-write KV cache triples multi-LoRA agent throughput

    ForkKV: Scaling Multi-LoRA Agent Serving via Copy-on-Write Disaggregated KV Cache

    Shao Wang +2

  20. cs.DC 2026-04-07 reviewed
    Power reconstruction shows 79% energy cut from mixed precision on Frontier

    Fine-Grained Power and Energy Attribution on AMD GPU/APU-Based Exascale Nodes

    Adam McDaniel +10

  21. cs.DC 2026-04-07 reviewed
    Codec signals triple VLM streaming throughput

    CodecSight: Leveraging Video Codec Signals for Efficient Streaming VLM Inference

    Yulin Zou +7

  22. cs.DC 2026-04-07 reviewed
    GTaP runtime runs fork-join tasks on GPUs faster than CPU OpenMP

    GTaP: A GPU-Resident Fork-Join Task-Parallel Runtime with a Pragma-Based Interface

    Yuki Maeda +1

  23. cs.DC 2026-04-07 reviewed
    Morton plane trees speed GPU neighbor search by over 10x

    JZ-Tree: GPU friendly neighbour search and friends-of-friends with dual tree walks in JAX plus CUDA

    Jens Stücker +4

  24. cs.DC 2026-04-07 reviewed
    Linearizable registers force extensive message chains

    Communication Requirements for Linearizable Registers

    Raïssa Nataf +1

  25. cs.DC 2026-04-07 reviewed
    Go runtime outperforms Python and Node.js for OpenFaaS on Kubernetes

    Optimizing OpenFaaS on Kubernetes: Comparative Analysis of Language Runtimes and Cluster Distributions

    Ehsan Ataie +2

  26. cs.LG 2026-04-07 reviewed
    ALTO speeds LoRA tuning 13.8x via early stops and shared scheduling

    ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads

    Jingwei Zuo +7

  27. cs.DC 2026-04-06 reviewed
    Persistent Alltoallv cuts MPI runtime up to 44% for large messages

    Analyzing Persistent Alltoallv RMA Implementations for High-Performance MPI Communication

    Evelyn Namugwanya

  28. cs.CL 2026-04-06 reviewed
    Single GPU trains 120B-parameter models at full precision

    MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

    Zhengqing Yuan +3

  29. cs.DC 2026-04-06 reviewed
    Decentralized relayers route cross-chain messages without hubs

    Towards Policy-Enabled Multi-Hop Routing for Cross-Chain Message Delivery

    Amin Rezaei +2

  30. cs.AR 2026-04-06 reviewed
    Tool explores 250 trillion 3D AI accelerator designs 100000 times faster

    DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators

    Zhiwen Mo +13

  31. cs.CR 2026-04-06 reviewed
    RegGuard cuts optimistic rollup settlement failures by over 90 percent

    RegGuard: Legitimacy and Fairness Enforcement for Optimistic Rollups

    Zhenhang Shang +2

  32. cs.DC 2026-04-06 reviewed
    Execution-idle wastes 10.7% of GPU cluster energy

    The Energy Cost of Execution-Idle in GPU Clusters

    Yiran Lei +6

  33. cs.LG 2026-04-06 reviewed
    Sampling parallelism scales Bayesian training linearly across GPUs

    Sampling Parallelism for Fast and Efficient Bayesian Learning

    Asena Karolin Özdemir +5

  34. cs.DC 2026-04-06 reviewed
    Splitting LLMs across LEO satellites cuts delay by 42%

    Communication-Efficient Collaborative LLM Inference over LEO Satellite Networks

    Songge Zhang +4

  35. cs.DC 2026-04-06 reviewed
    Zero downtime achieved in edge energy service migration

    Edge-Oriented Orchestration of Energy Services Using Graph-Driven Swarm Intelligence

    Liana Toderean +5

  36. cs.DC 2026-04-06 reviewed
    Single-agent exploration in dynamic graphs needs Omega(m) window

    Tight Bounds on Window Size and Time for Single-Agent Graph Exploration under T-Interval Connectivity

    Yuichi Sudo +6

  37. cs.DC 2026-04-06 reviewed
    Layout propagation removes redundant packing in GEMM sequences

    LP-GEMM: Integrating Layout Propagation into GEMM Operations

    César Guedes Carneiro +3

  38. cs.DC 2026-04-06 reviewed
    Slurm tool simplifies submissions and defers jobs to cut energy use

    NBI-Slurm: Simplified submission of Slurm jobs with energy saving mode

    Andrea Telatin

  39. cs.AI 2026-04-06 reviewed
    AI peer review platform detects fake citations over 85 percent of the time

    OpenCLAW-P2P v7.0-P2PCLAW: Resilient Multi-Layer Persistence, Live Reference Verification, and Production-Scale Evaluation of Decentralized AI Peer Review v7.0 -- Mathematical Corrections & Ecosystem Developments Edition

    Francisco Angulo de Lafuente +5

  40. cs.AI 2026-04-06 reviewed
    AI agents run peer review with 85% fabricated-citation detection

    OpenCLAW-P2P v7.0-P2PCLAW: Resilient Multi-Layer Persistence, Live Reference Verification, and Production-Scale Evaluation of Decentralized AI Peer Review v7.0 -- Mathematical Corrections & Ecosystem Developments Edition

    Francisco Angulo de Lafuente +5

  41. cs.DC 2026-04-06 reviewed
    Satellite emulators tested against real data show clear gaps

    An experimental evaluation of satellite constellation emulators

    Victor Cionca +3

  42. cs.DC 2026-04-06 reviewed
    Co-serving system raises SLO attainment for mixed diffusion workloads by up to 44%

    GENSERVE: Efficient Co-Serving of Heterogeneous Diffusion Model Workloads

    Fanjiang Ye +12

  43. cs.DC 2026-04-05 reviewed
    Ledger state serves as shared environment for agent coordination

    Ledger-State Stigmergy: A Formal Framework for Indirect Coordination Grounded in Distributed Ledger State

    Fernando Paredes García

  44. cs.DC 2026-04-05 reviewed
    Lemonshark cuts async BFT latency up to 65% with early finality

    Lemonshark: Asynchronous DAG-BFT With Early Finality

    Michael Yiqing Hu +4

  45. cs.CR 2026-04-04 reviewed
    SecureAFL detects bad updates and estimates missing ones in async FL

    SecureAFL: Secure Asynchronous Federated Learning

    Anjun Gao +5

  46. quant-ph 2026-04-04 reviewed
    GPU simulator speeds quantum circuits up to 146x over CPU

    GPU-Accelerated Quantum Simulation: Empirical Backend Selection, Gate Fusion, and Adaptive Precision

    Poornima Kumaresan +3

  47. quant-ph 2026-04-03 reviewed
    Four-layer middleware adapts hybrid quantum-HPC resources at runtime

    Hybrid Quantum-HPC Middleware Systems for Adaptive Resource, Workload and Task Management

    Pradeep Mantha +4

  48. cs.CR 2026-04-03 reviewed
    Hybrid parallelism scales encrypted Transformers across multiple GPUs

    AEGIS: Scaling Long-Sequence Homomorphic Encrypted Transformer Inference via Hybrid Parallelism on Multi-GPU Systems

    Zhaoting Gong +3

  49. cs.NI 2026-04-03 reviewed
    Granger causality quantifies noisy neighbor effects up to 67% slowdown

    Causal Inference for Quantifying Noisy Neighbor Effects in Multi-Tenant Cloud Environments

    Philipe S. Schiavo +8

  50. cs.DC 2026-04-03 reviewed
    Collective KV sharing runs 2.7x more multi-agent LLM agents

    TokenDance: Scaling Multi-Agent LLM Serving via Collective KV Cache Sharing

    Zhuohang Bian +5