SimReg regularization accelerates LLM pretraining convergence by over 30% and raises average zero-shot performance by over 1% across benchmarks.
Next token prediction towards multimodal intelligence: A comprehensive survey.arXiv preprint arXiv:2412.18619, 2024a
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
representative citing papers
A literature survey that organizes spoken language models by architecture, training, and evaluation choices and identifies key challenges and future directions.
citing papers explorer
-
SimReg: Achieving Higher Performance in the Pretraining via Embedding Similarity Regularization
SimReg regularization accelerates LLM pretraining convergence by over 30% and raises average zero-shot performance by over 1% across benchmarks.
-
On The Landscape of Spoken Language Models: A Comprehensive Survey
A literature survey that organizes spoken language models by architecture, training, and evaluation choices and identifies key challenges and future directions.
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation