Demons in the detail: On implementing load balancing loss for training specialized mixture-of-expert models, 2025 a

7 Pith papers cite this work. Polarity classification is still indexing.

Years: 2026 (3), 2025 (4)

representative citing papers

PithTrain: A Compact and Agent-Native MoE Training System

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

PithTrain is a compact, agent-native MoE training system that matches production throughput and improves agent-task efficiency, requiring up to 62% fewer turns and 64% less GPU time on the new ATE-Bench.

ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution

cs.CL · 2025-09-17 · unverdicted · novelty 6.0

ShinkaEvolve improves sample efficiency in LLM-driven program evolution via parent sampling, code-novelty rejection sampling, and bandit-based LLM ensemble selection, achieving a new state-of-the-art circle-packing result with 150 samples and gains on math reasoning and competitive programming tasks.

Qwen3 Technical Report

cs.CL · 2025-05-14 · unverdicted · novelty 5.0

Pith review generated a malformed one-line summary.
