C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 3polarities
background 3representative citing papers
PyTorch Fully Sharded Data Parallel enables training of significantly larger models than Distributed Data Parallel with comparable speed and near-linear TFLOPS scaling.
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.
citing papers explorer
-
C-Pack: Packed Resources For General Chinese Embeddings
C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.
-
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
PyTorch Fully Sharded Data Parallel enables training of significantly larger models than Distributed Data Parallel with comparable speed and near-linear TFLOPS scaling.
-
Data-Centric Foundation Models in Computational Healthcare: A Survey
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.