In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2356–2362
3 Pith papers cite this work.
citing papers explorer
- Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token
  Causal2Vec prepends a BERT-generated contextual token to decoder-only LLMs and pools its hidden state with the EOS token to reach new SOTA on MTEB among public-data-trained embedding models.
- R2MED: A Benchmark for Reasoning-Driven Medical Retrieval
  R2MED is the first benchmark for reasoning-driven medical retrieval, where even top models reach only 41.4 nDCG@10 on queries requiring inference beyond lexical or semantic overlap.
- Efficient Listwise Reranking with Compressed Document Representations
  RRK compresses documents to multi-token embeddings for efficient listwise reranking, enabling an 8B model to achieve 3x-18x speedups over smaller models with comparable or better effectiveness.