Nomic AI produced and open-sourced a reproducible 8192-context English text embedder that exceeds OpenAI Ada-002 and text-embedding-3-small performance on MTEB short-context and LoCo long-context benchmarks.
4 Pith papers cite this work.
citing papers explorer
- Nomic Embed: Training a Reproducible Long Context Text Embedder
  Nomic AI produced and open-sourced a reproducible 8192-context English text embedder that exceeds OpenAI Ada-002 and text-embedding-3-small performance on MTEB short-context and LoCo long-context benchmarks.
- Atlas: Few-shot Learning with Retrieval Augmented Language Models
  Atlas reaches over 42% accuracy on Natural Questions with only 64 examples, outperforming a 540B-parameter model by 3% with 50x fewer parameters.
- MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation
  MASS-RAG uses distinct agents for evidence summarization, extraction, and reasoning, then synthesizes their outputs to improve answer quality over standard RAG baselines on four benchmarks, especially when evidence is distributed.
- RAG over Thinking Traces Can Improve Reasoning Tasks