Title resolution pending

8-bit Optimizers via Block-wise Quantization , author= · 2021

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

The Falcon Series of Open Language Models

cs.CL · 2023-11-28 · conditional · novelty 6.0

Falcon-180B is a 180B-parameter open decoder-only model trained on 3.5 trillion tokens that approaches PaLM-2-Large performance at lower cost and is released with dataset extracts.

ST-MoE: Designing Stable and Transferable Sparse Expert Models

cs.CL · 2022-02-17 · unverdicted · novelty 6.0

ST-MoE introduces stability techniques for sparse expert models, allowing a 269B-parameter model to achieve state-of-the-art transfer learning results across reasoning, summarization, and QA tasks at the compute cost of a 32B dense model.

citing papers explorer

Showing 1 of 1 citing paper after filters.

The Falcon Series of Open Language Models cs.CL · 2023-11-28 · conditional · none · ref 77
Falcon-180B is a 180B-parameter open decoder-only model trained on 3.5 trillion tokens that approaches PaLM-2-Large performance at lower cost and is released with dataset extracts.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer