2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2025 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
PreScope combines a layer-aware activation predictor, cross-layer prefetch scheduling, and asynchronous I/O to deliver 141% higher throughput and 74.6% lower latency for MoE inference on legacy hardware.
citing papers explorer
- On Harnessing Idle Compute at the Edge for Foundation Model Training
  Cleave trains foundation models on heterogeneous edge devices by decomposing GEMM operations to exploit downlink-uplink asymmetry, achieving cloud-comparable speed and scaling to thousands of devices with fast failure recovery.
- LayerScope: Predictive Cross-Layer Scheduling for Efficient Multi-Batch MoE Inference on Legacy Servers
  PreScope combines a layer-aware activation predictor, cross-layer prefetch scheduling, and asynchronous I/O to deliver 141% higher throughput and 74.6% lower latency for MoE inference on legacy hardware.