pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1164 papers in cs.DC · page 6

  1. cs.DC 2026-05-07 reviewed
    AD replaces finite differences in INLA for 4-8x gradient speedups

    ADELIA: Automatic Differentiation for Efficient Laplace Inference Approximations

    Afif Boudaoud +8

  2. cs.DC 2026-05-07 reviewed
    ResiHP keeps LLM training fast by adapting to GPU failures

    ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism

    Tenghui Ma +6

  3. cs.DC 2026-05-07 reviewed
    ResiHP lifts LLM training speed 1–4× under real GPU failures

    ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism

    Tenghui Ma +6

  4. cs.AI 2026-05-07 reviewed
    Safactory unifies three platforms into one agent training pipeline

    Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence

    Xinquan Chen +38

  5. cs.AI 2026-05-07 reviewed
    Three platforms linked into one pipeline for autonomous agents

    Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence

    Xinquan Chen +38

  6. cs.DC 2026-05-07 reviewed
    TACO toolsuite verifies threshold automata for distributed algorithms

    TACO: A Toolsuite for the Verification of Threshold Automata

    Paul Eichler +5

  7. cs.DC 2026-05-07 reviewed
    BalanceRoute cuts DP imbalance in LLM serving

    Tackling the Data-Parallel Load Balancing Bottleneck in LLM Serving: Practical Online Routing at Scale

    Tianci Bu +8

  8. cs.DC 2026-05-07 reviewed
    Router cuts data-parallel imbalance in LLM clusters

    Tackling the Data-Parallel Load Balancing Bottleneck in LLM Serving: Practical Online Routing at Scale

    Tianci Bu +8

  9. cs.AI 2026-05-07 reviewed
    AI agents generate custom LLM serving systems competitive with vLLM

    VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

    Keisuke Kamahori +3

  10. cs.DC 2026-05-07 reviewed
    Automated low-complexity matrix multiplies beat hardware peaks

    FalconGEMM: Surpassing Hardware Peaks with Lower-Complexity Matrix Multiplication

    Honglin Zhu +12

  11. cs.DC 2026-05-07 reviewed
    FalconGEMM exceeds GEMM speeds by 7-17% via lower-complexity algorithms

    FalconGEMM: Surpassing Hardware Peaks with Lower-Complexity Matrix Multiplication

    Honglin Zhu +12

  12. cs.DC 2026-05-07 reviewed
    MoE cuts relay buffers with direct expert-window access

    Relay Buffer Independent Communication over Pooled HBM for Efficient MoE Inference on Ascend

    Tianlun Hu +10

  13. cs.DC 2026-05-07 reviewed
    Direct expert-window access removes relay buffers in MoE inference

    Relay Buffer Independent Communication over Pooled HBM for Efficient MoE Inference on Ascend

    Tianlun Hu +10

  14. cs.AI 2026-05-07 reviewed
    Structural alignment beats coordinate matching for heterogeneous prototypes

    From Coordinate Matching to Structural Alignment: Rethinking Prototype Alignment in Heterogeneous Federated Learning

    Xinghao Wu +5

  15. cs.PL 2026-05-07 reviewed
    Cache-free GPU enumeration outperforms priors on MBA synthesis

    GPU-Accelerated Synthesis of Mixed-Boolean Arithmetic: Beyond Caching

    Gabriel Bathie +2

  16. cs.AR 2026-05-07 reviewed
    Hardware hub lets MoE send data before knowing GPU addresses

    MoE-Hub: Taming Software Complexity for Seamless MoE Overlap with Hardware-Accelerated Communication on Multi-GPU Systems

    Zhuoshan Zhou +12

  17. cs.LO 2026-05-07 reviewed
    Gossip protocols fix their own faulty messages

    Self-Correcting Gossip Protocols

    Giorgio Cignarale +5

  18. cs.CR 2026-05-07 reviewed
    Gas Cards replace off-chain signers in ERC-4337 paymasters

    SuperPaymaster: Eliminating Centralized Signer Authority via Asset-Oriented Abstraction to Reconcile Usability and Decentralization in Account Abstraction

    Huifeng Jiao +1

  19. cs.DC 2026-05-07 reviewed
    Differential privacy keeps edge ML fast and harder to steal

    A Privacy-Preserving Machine Learning Framework for Edge Intelligence: An Empirical Analysis

    Quoc Lap Trieu +2

  20. cs.DC 2026-05-07 reviewed
    LLM priors raise DRL task offloading success by over 17%

    LLM-Enhanced Deep Reinforcement Learning for Task Offloading in Collaborative Edge Computing

    Hao Guo +3

  21. cs.DC 2026-05-07 reviewed
    MLA cache recovers 83% tokens despite position shifts

    Irminsul: MLA-Native Position-Independent Caching for Agentic LLM Serving

    Bole Ma +2

  22. cs.AR 2026-05-07 reviewed
    New in-switch method delivers 1.38x faster LLM tensor parallel training

    Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems

    Chen Zhang +12

  23. cs.AR 2026-05-07 reviewed
    DySHARP speeds MoE models 1.79x with dynamic in-switch computing

    Accelerating MoE with Dynamic In-Switch Computing on Multi-GPUs

    Qijun Zhang +12

  24. cs.DC 2026-05-07 reviewed
    Digital twin framework cuts data center power use with predictions

    A Scalable Digital Twin Framework for Energy Optimization in Data Centers

    Raphael Hendrigo de Souza Gonçalves +1

  25. cs.DC 2026-05-07 reviewed
    EdgeServing cuts SLO violations for multi-DNN edge serving

    EdgeServing: Deadline-Aware Multi-DNN Serving at the Edge

    Jiahe Cao +5

  26. cs.LG 2026-05-06 reviewed
    Simulation platform tests datacenter power flexibility for grid coordination

    OpenG2G: A Simulation Platform for AI Datacenter-Grid Runtime Coordination

    Jae-Won Chung +5

  27. cs.DC 2026-05-06 reviewed
    Dynamic tensor parallelism raises LLM goodput up to 5.3x

    Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism

    Vikranth Srivatsa +4

  28. cs.DC 2026-05-06 reviewed
    Nine-dimension model explains root causes in five of twelve DeFi incidents

    Toward a Risk Assessment Framework for Institutional DeFi: A Nine-Dimension Approach

    Eva Oberholzer +3

  29. cs.DC 2026-05-06 reviewed
    Resource model lifts MoE training efficiency 2-3.5X

    Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism

    Sajal Dash +1

  30. cs.DC 2026-05-06 reviewed
    DPU offload gives 1.55x speedup but 625x more DRAM traffic

    Communication Offloading on SmartNIC DPUs: A Quantitative Approach

    Jacob Wahlgren +4

  31. cs.DC 2026-05-06 reviewed
    DPU offload delivers 1.55x speedup when memory-to-comm ratio is high

    Communication Offloading on SmartNIC DPUs: A Quantitative Approach

    Jacob Wahlgren +4

  32. cs.CY 2026-05-06 reviewed
    Digital twin trust maps to four integration patterns across domains

    Trustworthiness in Digital Twin Systems: Systematic Review and Research Horizons

    Chi Fai David Lam +3

  33. cs.DC 2026-05-06 reviewed
    Satellite AI cuts delays 32 percent with model collaboration

    Delay-Aware Large-Small Model Collaboration over LEO Satellite Networks

    Mingyu Guo +4

  34. cs.DC 2026-05-06 reviewed
    CCL-D pinpoints slow and hang anomalies in 4000-GPU clusters within 6 minutes

    CCL-D: A High-Precision Diagnostic System for Slow and Hang Anomalies in Large-Scale Model Training

    Yida Gu +19

  35. cs.PF 2026-05-06 reviewed
    LLM agents turn GPU profiles into optimization advice

    KEET: Explaining Performance of GPU Kernels Using LLM Agents

    Joshua H. Davis +7

  36. cs.DC 2026-05-06 reviewed
    Adaptive HBM split cuts recommender P99 latency 24-38%

    One Pool, Two Caches: Adaptive HBM Partitioning for Accelerating Generative Recommender Serving

    Wenjun Yu +2

  37. cs.DC 2026-05-05 reviewed
    Coral cuts multi-LLM serving costs by up to 2.79x on mixed GPUs

    Coral: Cost-Efficient Multi-LLM Serving over Heterogeneous Cloud GPUs

    Yixuan Mei +7

  38. physics.comp-ph 2026-05-05 reviewed
    GPU code speeds moving-boundary fluid simulations 20X

    GPU-Accelerated Simulations of Problems with Moving Boundaries and Fluid-Structure Interaction at Extreme Scales

    Sushrut Kumar +4

  39. cs.NI 2026-05-05 reviewed
    MRC lets AI training survive network faults by spraying across paths

    Resilient AI Supercomputer Networking using MRC and SRv6

    Joao Araujo +49

  40. cs.DC 2026-05-05 reviewed
    Serverless orchestration breaks in LEO continua

    Orchestrating Serverless Applications in the Edge Cloud Space Continuum: What Breaks and What is Next?

    Hadi Tabatabaee Malazi +3

  41. cs.DC 2026-05-05 reviewed
    ClusterLess cuts edge workflow times by up to 40%

    ClusterLess: Deadline-Aware Serverless Workflow Orchestration on Federated Edge Clusters

    Reza Farahani +6

  42. cs.CR 2026-05-05 reviewed
    Ledger stores CP-ABE keys so IoT users decrypt locally and revoke by epoch rotation

    Revocation-Ready CP-ABE Key Management for Blockchain-Based IoT Data Sharing

    Chun Yin Chiu

  43. cs.DC 2026-05-05 reviewed
    Control plane unifies physical neural networks across materials

    phys-MCP: A Control Plane for Heterogeneous Physical Neural Networks

    Stefan Fischer +2

  44. eess.SY 2026-05-05 reviewed
    Power grids need fast and slow thinking to handle renewables

    Thinking fast and slow -- a cognitive inspired framework for decision intelligence for power systems

    Apoorv Mathur

  45. eess.SY 2026-05-05 reviewed
    Cognitive models structure power grid decisions across timescales

    Thinking fast and slow -- a cognitive inspired framework for decision intelligence for power systems

    Apoorv Mathur

  46. cs.OS 2026-05-05 reviewed
    Pub/sub smart pointer limits reference updates to 0-1 per subscriber

    ipc_shared_ptr: A Publish/Subscribe-Aware Smart Pointer for Cross-Process Object Lifetime Management

    Takahiro Ishikawa-Aso +4

  47. cs.DC 2026-05-05 reviewed
    Microbenchmark models predict GPU performance with 1% error

    Microbenchmark-Driven Analytical Performance Modeling Across Modern GPU Architectures

    Aaron Jarmusch +1

  48. cs.DC 2026-05-05 reviewed
    True Sessions enable scalable MPI initialization in MPICH

    Implementing True MPI Sessions and Evaluating MPI Initialization Scalability

    Hui Zhou +4

  49. cs.NI 2026-05-05 reviewed
    Federated learning fails at 5s latency due to TCP handshake timeouts

    Surviving the Edge: Federated Learning under Networking and Resource Constraints

    Mike Mwanje +3

  50. cs.DC 2026-05-05 reviewed
    HPC workflows pause for human input without idling compute resources

    A Workflow-Oriented Framework for Asynchronous Human-AI Collaboration in Hybrid and Compute-Intensive HPC Environments

    Sergio Mendoza +7