pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1164 papers in cs.DC · page 16

  1. cs.DC 2026-03-14 reviewed
    Grassroots bonds turn community trust into interest-bearing liquidity

    Grassroots Bonds as a Foundation for Market Liquidity

    Ehud Shapiro

  2. cs.DC 2026-03-13 reviewed
    Token-budget routing cuts LLM GPU fleet 17-39%

    Token-Budget-Aware Pool Routing for Cost-Efficient LLM Inference

    Huamin Chen +4

  3. cs.DC 2026-03-13 reviewed
    Engine runs 1,200-node graphs after one agent call

    Separating Intelligence from Execution: A Workflow Engine for the Model Context Protocol

    Abhinav Singh Parmar

  4. cs.LG 2026-03-12 reviewed
    Cornserve boosts any-to-any model serving by 3.81x throughput

    Cornserve: A Distributed Serving System for Any-to-Any Multimodal Models

    Jae-Won Chung +6

  5. cs.DC 2026-03-12 reviewed
    Multi-agent RL with graphs beats default Kubernetes scheduler

    AGMARL-DKS: An Adaptive Graph-Enhanced Multi-Agent Reinforcement Learning for Dynamic Kubernetes Scheduling

    Hamed Hamzeh

  6. cs.DC 2026-03-12 reviewed
    Batch size cuts energy in LLM workflows but only for certain tasks

    Characterizing Performance-Energy Trade-offs of Large Language Models in Multi-Request Workflows

    Md. Monzurul Amin Ifath +1

  7. cs.DC 2026-03-12 reviewed
    NCCLbpf adds verified eBPF policies to NCCL plugins with 130 ns overhead

    NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication

    Yusheng Zheng

  8. cs.LG 2026-03-11 reviewed
    Scheduler cuts multi-job federated learning time by 8.3x

    FedACT: Concurrent Federated Intelligence across Heterogeneous Data Sources

    Md Sirajul Islam +6

  9. cs.CR 2026-03-11 reviewed
    PrefixWall raises LLM cache reuse 70% over isolation

    PrefixWall: Mitigating Prefix Caching Side Channels in Shared LLM Systems

    Panagiotis Georgios Pennas +3

  10. cs.DC 2026-03-11 reviewed
    Ozaki-II adapted to FP8 cuts cost of double-precision matrix emulation

    Double-Precision Matrix Multiplication Emulation via Ozaki-II Scheme with FP8 Quantization

    Yuki Uchino +2

  11. cs.DC 2026-03-11 reviewed
    Cloud LLM creates and pushes adaptive code to edge devices

    LLM-assisted Agentic Edge Intelligence Framework

    Chinmaya Kumar Dehury +4

  12. cs.DC 2026-03-10 reviewed
    Flash-KMeans runs exact GPU k-means 18x faster

    Flash-KMeans: Fast and Memory-Efficient Exact K-Means

    Shuo Yang +12

  13. cs.DC 2026-03-10 reviewed
    Sparse gating turns LLM batches into elastic super-trees for 5x speedup

    ECHO: Elastic Speculative Decoding with Sparse Gating for High-Concurrency Scenarios

    Xinyi Hu +8

  14. cs.DC 2026-03-10 reviewed
    FP64 tensor cores speed finite-element kernels 2x

    Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores

    Jiqun Tu +6

  15. cs.DC 2026-03-08 reviewed
    ArcLight raises CPU LLM throughput by 46% via NUMA control

    ArcLight: A Lightweight LLM Inference Architecture for Many-Core CPUs

    Yuzhuang Xu +3

  16. cs.AI 2026-03-08 reviewed
    Graph engine runs LLM agents with zero hallucinations

    GraphBit: A Graph-based Agentic Framework for Non-Linear Agent Orchestration

    Yeahia Sarker +3

  17. cs.DC 2026-03-08 reviewed
    Potential games and LLM weights optimize UAV networks

    Agentic AI-Driven UAV Network Deployment: An LLM-Enhanced Exact Potential Game Approach

    Xin Tang +8

  18. cs.DC 2026-03-07 reviewed
    ML duration predictor trims supercomputer job waits by 11%

    Duration-Informed Workload Scheduler

    Daniela Loreti +2

  19. cs.DC 2026-03-07 reviewed
    Simulator tests failure knobs for large AI clusters

    AIReSim: A Discrete Event Simulator for Large-scale AI Cluster Reliability Modeling

    Karthik Pattabiraman +2

  20. cs.DC 2026-03-06 reviewed
    OMA retains Kubernetes crash evidence past the evidence horizon

    Operational Memory Architecture for Kubernetes: Preserving Causal Context Across the Evidence Horizon

    Shamsher Khan

  21. cs.DC 2026-03-06 reviewed
    DMM merges divergent models data-free using normalization stats

    Domain-Adaptive Model Merging Across Disconnected Modes

    Junming Liu +4

  22. cs.DC 2026-03-05 reviewed
    Misaligned dimensions keep compressed LLMs from speeding up

    Why Smaller Is Slower? Dimensional Misalignment in Compressed LLMs

    Jihao Xin +4

  23. quant-ph 2026-03-04 reviewed
    Heron beats Eagle in protocol benchmarks for quantum advantage

    Benchmarking Quantum Computers via Protocols, Comparing IBM's Heron vs IBM's Eagle

    Nitay Mayo +2

  24. cs.DC 2026-03-04 reviewed
    Benchmark suite derives efficiency rules for compound AI

    Benchmarking Compound AI Applications for Hardware-Software Co-Design

    Paramuth Samuthrsindh +5

  25. cs.DC 2026-03-04 reviewed
    Planning system decides satellite vs ground tasks to fit data transfers

    Constraint-Aware Execution Planning for Hybrid Space-Ground Compute Workloads

    Subhadip Mitra

  26. quant-ph 2026-03-04 reviewed
    Beam search reduces quantum communication costs in circuit partitioning

    Efficient Time-Aware Partitioning of Quantum Circuits for Distributed Quantum Computing

    Raymond P. H. Wu +5

  27. cs.DC 2026-03-04 reviewed
    Unified objects automate IoT edge-cloud apps with 9 nines availability

    EdgeWeaver: Accelerating IoT Application Development Across Edge-Cloud Continuum

    Pawissanutt Lertpongrujikorn +3

  28. cs.DC 2026-03-04 reviewed
    Fixed encoding decodes data 9-213× faster than Protocol Buffers

    Simplicity Scales

    Andrew Sampson (6OVER3 Institute) +2

  29. quant-ph 2026-03-03 reviewed
    Gate fusion speeds quantum ML simulation by 20 times

    Fast and memory-efficient classical simulation of quantum machine learning via forward and backward gate fusion

    Yoshiaki Kawase

  30. cs.RO 2026-03-03 reviewed
    The paper introduces the cuNRTO framework with two new CUDA-based architectures

    cuNRTO: GPU-Accelerated Nonlinear Robust Trajectory Optimization

    Jiawei Wang +2

  31. cs.DC 2026-03-01 reviewed
    Filecoin reaches 2^{-30} finality in 30 rounds not 900

    The Finality Calculator: Analyzing and Quantifying Filecoin's Finality Guarantees

    Guy Goren +1

  32. cs.DC 2026-02-27 reviewed
    SPARe keeps fault-tolerance overhead at 2-3x for 100k GPU LLM training

    SPARe: Stacked Parallelism with Adaptive Reordering for Fault-Tolerant LLM Pretraining Systems with 100k+ GPUs

    Jin Lee +8

  33. cs.LG 2026-02-27 reviewed
    Perturbed model copies enable private LLM unlearning

    MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

    Tiantong Wang +5

  34. cs.CR 2026-02-26 reviewed
    Protocol outsources MSM with 300x faster verification

    2G2T: Constant-Size, Statistically Sound MSM Outsourcing

    Majid Khabbazian

  35. cs.LG 2026-02-26 reviewed
    Shared caching cuts edge LLM first-token time by 93%

    Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching

    Hiroki Matsutani +2

  36. cs.DC 2026-02-25 reviewed
    CXL memory pool beats InfiniBand on GPU collectives

    CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

    Dong Xu +16

  37. cs.DC 2026-02-25 reviewed
    Flexible sharding lifts FSDP speed by up to 66% at 10k GPUs

    veScale-FSDP: Flexible and High-Performance FSDP at Scale

    Zezhou Wang +11

  38. cs.RO 2026-02-24 reviewed
    GPU hybrid matches top solvers on large multi-depot routing

    A GPU-Accelerated Hybrid Method for a Class of Multi-Depot Vehicle Routing Problems

    Zhenyu Lei +1

  39. cs.DC 2026-02-24 reviewed
    Morton curve defined for pyramids in hybrid AMR

    A Morton-Type Space-Filling Curve for Pyramid Subdivision and Hybrid Adaptive Mesh Refinement

    David Knapp +4

  40. cs.DC 2026-02-22 reviewed
    Semantic dependencies resolve data conflicts locally via rebasing

    Semantic Conflict Model for Collaborative Data Structures

    Georgii Semenov +1

  41. cs.DC 2026-02-21 reviewed
    DualScale cuts energy up to 48% in LLM decode phase

    DualScale: Energy-Efficient Disaggregated LLM Serving via Phase-Aware Placement and DVFS

    Omar Basit +3

  42. cs.MA 2026-02-21 reviewed
    74% of workflows need no coordination for correctness

    When Coordination Is Avoidable: A Monotonicity Analysis of Organizational Tasks

    Harang Ju

  43. cs.DC 2026-02-19 reviewed
    GPU memory estimators fail to generalize across hardware

    GPU Memory and Utilization Estimation for Training-Aware Resource Management: Opportunities and Limitations

    Ehsan Yousefzadeh-Asl-Miandoab +4

  44. cs.DC 2026-02-19 reviewed
    SwapLess cuts Edge TPU latency up to 77% via CPU-TPU partitioning

    Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs

    Nathan Ng +7

  45. cs.DC 2026-02-18 reviewed
    Prebuilt hypertree removes locks from parallel node generation

    Load Balanced Parallel Node Generation for Meshless Numerical Methods

    Jon Vehovar +3

  46. cs.DC 2026-02-18 reviewed
    Circuit cutting trains QNNs on distributed systems without losing accuracy

    DistributedEstimator: Distributed Training of Quantum Neural Networks via Circuit Cutting

    Prabhjot Singh +2

  47. cs.LG 2026-02-17 reviewed
    Cloud inference matches on-device for real-time braking

    Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference

    Pragya Sharma +2

  48. cs.DC 2026-02-15 reviewed
    Baremetal runtime lifts AI efficiency 9x on 10x fewer tiles

    AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators

    Hua Jiang +3

  49. cs.MS 2026-02-15 reviewed
    Direct solvers scale via communication cuts and low-rank compression

    Parallel Sparse and Data-Sparse Factorization-based Linear Solvers

    Xiaoye Sherry Li +1

  50. cs.DC 2026-02-14 reviewed
    Energy use shifts from linear to root function as core count rises

    The Impact of Process Competition on Energy Consumption: Analysis and Modeling

    Eduardo Gomes Campos +5