Canonical reference

Tender: Accelerating large language models via tensor decomposition and runtime requantization

Of the classified citations from Pith papers, 87% cite this work as background.

57 Pith papers citing it
Background 87% of classified citations

citation-role summary

background 13 · baseline 1 · dataset 1

years

2026: 52 · 2025: 5

representative citing papers

Latency Prediction for LLM Inference on NPU Systems

cs.DC · 2026-06-16 · unverdicted · novelty 7.0

LENS predicts NPU LLM inference latency with 2.15% mean error by profiling each bucket with two E2E measurements and composing results to capture bucketing non-linearity.

Scalable Concurrent Queues for GPU

cs.DC · 2026-06-01 · unverdicted · novelty 7.0

Introduces three linearizable GPU concurrent queues: an adapted wait-free queue using segments, a bounded lock-free queue with wave-batched paths, and a bounded wait-free queue using 64-bit CAS operations.

DiLaServe: High SLO Attainment Serving for Diffusion Language Models

cs.LG · 2026-06-27 · unverdicted · novelty 6.0

DiLaServe improves SLO attainment for diffusion language models by up to 56.6 percentage points and reduces latency by up to 46% with less than 1% accuracy drop via deadline-aware scheduling and dynamic reconfiguration.

KernelSight-LM: A Kernel-Level LLM Inference Simulator

cs.PF · 2026-06-26 · unverdicted · novelty 6.0

KernelSight-LM simulates token-level LLM inference to predict per-kernel latencies and end-to-end metrics (TTFT, TPOT, throughput), with kernel errors of 12.1% and 3.8% in the cross-generation and target-measured tiers, respectively.

Designing Datacenter Power Delivery Hierarchies for the AI Era

cs.DC · 2026-05-15 · unverdicted · novelty 6.0

Develops a simulation framework showing multi-resource stranding changes deployable capacity and effective costs in AI datacenters, arguing the key metric is deployable capacity over time rather than installed megawatts.
