PowLU replaces SwiGLU with a rational-power activation to reduce outlier amplification and numerical instability during large-scale LLM pre-training while matching performance.
Faster and lighter LLMs: A survey on current challenges and way forward.arXiv preprint arXiv:2402.01799,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.
citing papers explorer
-
PowLU: An Activation Function for Stable Pre-Training of LLMs
PowLU replaces SwiGLU with a rational-power activation to reduce outlier amplification and numerical instability during large-scale LLM pre-training while matching performance.