Covenant-72B: Pre-training a 72B LLM with trustless peers over-the-internet. arXiv preprint arXiv:2603.08163, 2026
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Learned Subspace Compression for Communication-Efficient Pipeline Parallelism
  MAPL learns task-specific orthogonal compression subspaces per pipeline stage via manifold-constrained optimization and recovers signals with low-overhead anchors, yielding better compression-performance tradeoffs than fixed projections on LLaMA models up to 1B parameters.
- Does Distributed Training Undermine Compute Governance?
  Distributed training may enable evasion of cluster-based compute governance for frontier AI, requiring new detection approaches such as chip tracking and cluster thresholds.