Canonical reference

Title resolution pending

Shiwei Gao, Youmin Chen, Jiwu Shu · 2025

5 Pith papers cite this work; 80% of classified citations (4 of 5) cite it as background.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 4 · method: 1

citation-polarity summary

background: 4 · use method: 1

representative citing papers

Efficient Training on Multiple Consumer GPUs with RoundPipe

cs.DC · 2026-04-29 · conditional · novelty 8.0

RoundPipe achieves near-zero-bubble pipeline parallelism for LLM training on consumer GPUs by dynamically dispatching computation stages round-robin, yielding 1.48-2.16x speedups and enabling 235B model fine-tuning on 8x RTX 4090.

Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

cs.CL · 2026-04-27 · unverdicted · novelty 7.0

DataPRM is a new process reward model for data analysis agents that detects silent errors via environment interaction and ternary rewards, yielding 7-11% gains on benchmarks and further RL improvements.

COPUS: Co-adaptive Parallelism and Batch Size Selection in Large Language Model Training

cs.DC · 2026-04-29 · unverdicted · novelty 6.0

COPUS co-adapts batch size and parallelism during LLM training, using goodput as the guiding metric, and converges 3.9-8% faster on average than fixing one while tuning the other.

RetroInfer: A Vector Storage Engine for Scalable Long-Context LLM Inference

cs.LG · 2025-05-05 · conditional · novelty 6.0

RetroInfer introduces the wave index and wave buffer to realize sparse KV-cache attention for long-context LLM inference, with up to 4.4x throughput gains while matching full-attention accuracy.

Strait: Perceiving Priority and Interference in ML Inference Serving

cs.LG · 2026-04-30 · unverdicted · novelty 5.0

Strait cuts high-priority deadline violations in ML inference serving by 1-11 percentage points through contention modeling and priority scheduling under high GPU load.

citing papers explorer

Showing 5 of 5 citing papers.

Efficient Training on Multiple Consumer GPUs with RoundPipe
cs.DC · 2026-04-29 · conditional · none · ref 49

Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis
cs.CL · 2026-04-27 · unverdicted · none · ref 51

COPUS: Co-adaptive Parallelism and Batch Size Selection in Large Language Model Training
cs.DC · 2026-04-29 · unverdicted · none · ref 45

RetroInfer: A Vector Storage Engine for Scalable Long-Context LLM Inference
cs.LG · 2025-05-05 · conditional · none · ref 29

Strait: Perceiving Priority and Interference in ML Inference Serving
cs.LG · 2026-04-30 · unverdicted · none · ref 53
