SAMoRA is a parameter-efficient fine-tuning framework that uses semantic-aware routing and task-adaptive scaling within a Mixture of LoRA Experts to improve multi-task performance and generalization over prior methods.
Pangu-{\Sigma}: Towards trillion parameter language model with sparse heterogeneous comput- ing
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Local Platt scaling on three fine-grained confidence scores reduces calibration error for LLM-based automated code revision across tasks and models compared to global scaling alone.
DeepSeekMoE 2B matches GShard 2.9B performance and approaches a dense 2B model; the 16B version matches LLaMA2-7B at 40% compute by using fine-grained expert segmentation plus shared experts.
This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.
A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.
citing papers explorer
-
SAMoRA: Semantic-Aware Mixture of LoRA Experts for Task-Adaptive Learning
SAMoRA is a parameter-efficient fine-tuning framework that uses semantic-aware routing and task-adaptive scaling within a Mixture of LoRA Experts to improve multi-task performance and generalization over prior methods.
-
Fine-grained Approaches for Confidence Calibration of LLMs in Automated Code Revision
Local Platt scaling on three fine-grained confidence scores reduces calibration error for LLM-based automated code revision across tasks and models compared to global scaling alone.
-
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
DeepSeekMoE 2B matches GShard 2.9B performance and approaches a dense 2B model; the 16B version matches LLaMA2-7B at 40% compute by using fine-grained expert segmentation plus shared experts.
-
A Survey of Large Language Models
This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.
-
A Comprehensive Overview of Large Language Models
A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.