super hub Mixed citations

Text Embeddings by Weakly-Supervised Contrastive Pre-training

Binxing Jiao, Daxin Jiang, Liang Wang, Linjun Yang, Nan Yang, Xiaolong Huang · 2022 · cs.CL · arXiv 2212.03533

Mixed citation behavior. Most common role is method (39%).

155 Pith papers citing it

Method 39% of classified citations

open full Pith review browse 155 citing papers more from Binxing Jiao arXiv PDF

abstract

This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks. The model is trained in a contrastive manner with weak supervision signals from our curated large-scale text pair dataset (called CCPairs). E5 can be readily used as a general-purpose embedding model for any tasks requiring a single-vector representation of texts such as retrieval, clustering, and classification, achieving strong performance in both zero-shot and fine-tuned settings. We conduct extensive evaluations on 56 datasets from the BEIR and MTEB benchmarks. For zero-shot settings, E5 is the first model that outperforms the strong BM25 baseline on the BEIR retrieval benchmark without using any labeled data. When fine-tuned, E5 obtains the best results on the MTEB benchmark, beating existing embedding models with 40x more parameters.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 10 method 9 other 2 baseline 1 dataset 1

citation-polarity summary

use method 9 background 8 support 2 unclear 2 baseline 1 use dataset 1

claims ledger

abstract This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks. The model is trained in a contrastive manner with weak supervision signals from our curated large-scale text pair dataset (called CCPairs). E5 can be readily used as a general-purpose embedding model for any tasks requiring a single-vector representation of texts such as retrieval, clustering, and classification, achieving strong performance in both zero-shot and fine-tuned settings. We conduct extensive evaluations on 56 datasets from the BEIR and MTEB benchmarks. For zero-shot se

authors

Binxing Jiao Daxin Jiang Liang Wang Linjun Yang Nan Yang Xiaolong Huang

co-cited works

representative citing papers

Is Dimensionality a Barrier for Retrieval Models?

cs.LG · 2026-05-22 · unverdicted · novelty 8.0

Dimension d = O(m^{-2} log n) nearly achieves the optimal margin m^rd(+∞, A) for retrieval embeddings, with matching lower bounds showing d = O(k log(n/k)) suffices and is necessary for m = Θ(k^{-1/2}) on k-sparse query matrices.

STRABLE: Benchmarking Tabular Machine Learning with Strings

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.

FollowTable: A Benchmark for Instruction-Following Table Retrieval

cs.IR · 2026-05-01 · unverdicted · novelty 8.0

FollowTable is the first large-scale benchmark for instruction-following table retrieval, paired with an Instruction Responsiveness Score, showing that existing models fail to adapt to fine-grained constraints beyond topical similarity.

Diagnosing and Mitigating Retrieval Bottlenecks in LLM-Based Cold-Start Recommendation

cs.IR · 2026-06-29 · conditional · novelty 7.0

Retrieval coverage limits LLM rerankers in cold-start recommendation; a learned hybrid fusion improves pool quality but LLM reranking often degrades end-to-end performance while simpler rankers exploit the pool.

Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings

cs.CL · 2026-06-28 · conditional · novelty 7.0

Anisotropy, quantified by dominant-dimension variance fraction, determines the best parameter-free similarity metric for text embeddings, with rank-based metrics gaining ~20% relative where cosine is weakest.

Agentic Abstention: Do Agents Know When to Stop Instead of Act?

cs.AI · 2026-06-27 · unverdicted · novelty 7.0

LLM agents often fail to abstain at the right time in uncertain multi-turn tasks, and the CONVOLVE context engineering method raises timely abstention rates on WebShop from 26.7 to 57.4 without parameter updates.

A Sensitivity-Aware Test Collection for Search Among Personal Information

cs.IR · 2026-06-25 · accept · novelty 7.0

A new sensitivity-labeled test collection is released from Enron emails with crowdsourced queries, relevance judgments, and LLM extensions for evaluating sensitivity-aware search.

The Voronoi Bottleneck: Capacity-Aware Dense Retrieval for Product Search

cs.IR · 2026-06-09 · unverdicted · novelty 7.0

Proves Voronoi complexity equals sign-rank for top-1 retrieval, introduces CUS diagnostic predicting retrieval failure at AUC >0.8 without labels, and AT-DW-InfoNCE objective with derived alpha^*=2.0 that improves Recall@100 on synthetic data.

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning

cs.CL · 2026-05-30 · unverdicted · novelty 7.0

SelSkill applies dual-granularity preference learning to selective skill-or-skip decisions, improving task success by 10.9 points and execution precision by 29.1 points on ALFWorld with Qwen3-8B.

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

cs.CR · 2026-05-28 · unverdicted · novelty 7.0

MemPoison enables stealthy memory poisoning in LLM agents via dialogue by using semantic relational bridges, entity masquerading, and joint embedding optimization to bypass selective extraction and rewriting, achieving up to 0.95 attack success rate.

Towards Cost-effective LLMs Routing with Batch Prompting

cs.DB · 2026-05-27 · unverdicted · novelty 7.0

RoBatch is a two-stage framework that formulates and solves the joint Route with Batching Problem via a batch-aware proxy utility model and greedy scheduling, outperforming separate routing or batching baselines on six benchmarks.

The Harder Text Embedding Benchmark (HTEB): Beyond One-dimensional Static Robustness

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

HTEB introduces dynamic, multi-axis evaluation of text embedding robustness using LLM transformations, finding decoupled profiles across models and that scaling does not close all robustness gaps.

IdioLink: Retrieving Meaning Beyond Words Across Idiomatic and Literal Expressions

cs.CL · 2026-05-21 · unverdicted · novelty 7.0

IdioLink introduces a benchmark dataset and evaluation showing that strong embedding models struggle to retrieve equivalent meanings across idiomatic and literal forms, relying on shallow cues instead.

Generative Conversational Recommender System

cs.IR · 2026-05-21 · unverdicted · novelty 7.0

A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.

Linked Multi-Model Data on Russian Domestic and Foreign Policy Speeches

cs.CL · 2026-05-15 · unverdicted · novelty 7.0

A new linked multimodal dataset of Russian domestic and foreign policy speeches with texts, images, captions, harmonized metadata, and expert-refined topic annotations is introduced to support analyses in political communication and LLM applications.

Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

TWN attaches separate reasoning and embedding LoRA adapters to a frozen backbone with gradient detachment and a self-supervised gate that decides per input whether to generate CoT, achieving SOTA on MMEB-V2 with 3-5% added parameters and up to 50% fewer reasoning tokens.

Very Efficient Listwise Multimodal Reranking for Long Documents

cs.IR · 2026-05-12 · unverdicted · novelty 7.0

ZipRerank delivers state-of-the-art multimodal listwise reranking accuracy for long documents at up to 10x lower latency via early interaction and single-pass scoring.

Breaking $\textit{Winner-Takes-All}$: Cooperative Policy Optimization Improves Diverse LLM Reasoning

cs.AI · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

GCPO uses team-level credit assignment via determinant volume over reward-weighted semantic embeddings to promote non-redundant correct reasoning paths, improving both accuracy and diversity in LLM training.

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.

Skill Description Deception Attack against Task Routing in Internet of Agents

cs.MA · 2026-05-11 · conditional · novelty 7.0

Malicious agents can deceive LLM-based task routers in Internet of Agents systems by generating fake skill descriptions, achieving up to 98% success rate across nine domains.

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

cs.CL · 2026-05-07 · unverdicted · novelty 7.0

LatentRAG performs agentic RAG by generating latent tokens for thoughts and subqueries in one forward pass, matching explicit methods' accuracy on seven benchmarks while reducing latency by ~90%.

SkillRet: A Large-Scale Benchmark for Skill Retrieval in LLM Agents

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

SkillRet benchmark shows fine-tuned retrievers improve NDCG@10 by 13+ points over prior models on large-scale skill retrieval for LLM agents.

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.

Embedding-based In-Context Prompt Training for Enhancing LLMs as Text Encoders

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

EPIC trains LLMs to treat continuous embeddings as in-context prompts, yielding state-of-the-art text embedding performance on MTEB with or without prompts at inference and lower compute.

citing papers explorer

Showing 50 of 155 citing papers.

Is Dimensionality a Barrier for Retrieval Models? cs.LG · 2026-05-22 · unverdicted · none · ref 167 · internal anchor
Dimension d = O(m^{-2} log n) nearly achieves the optimal margin m^rd(+∞, A) for retrieval embeddings, with matching lower bounds showing d = O(k log(n/k)) suffices and is necessary for m = Θ(k^{-1/2}) on k-sparse query matrices.
STRABLE: Benchmarking Tabular Machine Learning with Strings cs.LG · 2026-05-12 · unverdicted · none · ref 64 · internal anchor
A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
FollowTable: A Benchmark for Instruction-Following Table Retrieval cs.IR · 2026-05-01 · unverdicted · none · ref 49 · internal anchor
FollowTable is the first large-scale benchmark for instruction-following table retrieval, paired with an Instruction Responsiveness Score, showing that existing models fail to adapt to fine-grained constraints beyond topical similarity.
Diagnosing and Mitigating Retrieval Bottlenecks in LLM-Based Cold-Start Recommendation cs.IR · 2026-06-29 · conditional · none · ref 23 · internal anchor
Retrieval coverage limits LLM rerankers in cold-start recommendation; a learned hybrid fusion improves pool quality but LLM reranking often degrades end-to-end performance while simpler rankers exploit the pool.
Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings cs.CL · 2026-06-28 · conditional · none · ref 24 · internal anchor
Anisotropy, quantified by dominant-dimension variance fraction, determines the best parameter-free similarity metric for text embeddings, with rank-based metrics gaining ~20% relative where cosine is weakest.
Agentic Abstention: Do Agents Know When to Stop Instead of Act? cs.AI · 2026-06-27 · unverdicted · none · ref 86 · internal anchor
LLM agents often fail to abstain at the right time in uncertain multi-turn tasks, and the CONVOLVE context engineering method raises timely abstention rates on WebShop from 26.7 to 57.4 without parameter updates.
A Sensitivity-Aware Test Collection for Search Among Personal Information cs.IR · 2026-06-25 · accept · none · ref 67 · internal anchor
A new sensitivity-labeled test collection is released from Enron emails with crowdsourced queries, relevance judgments, and LLM extensions for evaluating sensitivity-aware search.
The Voronoi Bottleneck: Capacity-Aware Dense Retrieval for Product Search cs.IR · 2026-06-09 · unverdicted · none · ref 16 · internal anchor
Proves Voronoi complexity equals sign-rank for top-1 retrieval, introduces CUS diagnostic predicting retrieval failure at AUC >0.8 without labels, and AT-DW-InfoNCE objective with derived alpha^*=2.0 that improves Recall@100 on synthetic data.
Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning cs.CL · 2026-05-30 · unverdicted · none · ref 48 · internal anchor
SelSkill applies dual-granularity preference learning to selective skill-or-skip decisions, improving task success by 10.9 points and execution precision by 29.1 points on ALFWorld with Qwen3-8B.
Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction cs.CR · 2026-05-28 · unverdicted · none · ref 41 · internal anchor
MemPoison enables stealthy memory poisoning in LLM agents via dialogue by using semantic relational bridges, entity masquerading, and joint embedding optimization to bypass selective extraction and rewriting, achieving up to 0.95 attack success rate.
Towards Cost-effective LLMs Routing with Batch Prompting cs.DB · 2026-05-27 · unverdicted · none · ref 31 · internal anchor
RoBatch is a two-stage framework that formulates and solves the joint Route with Batching Problem via a batch-aware proxy utility model and greedy scheduling, outperforming separate routing or batching baselines on six benchmarks.
The Harder Text Embedding Benchmark (HTEB): Beyond One-dimensional Static Robustness cs.CL · 2026-05-27 · unverdicted · none · ref 60 · internal anchor
HTEB introduces dynamic, multi-axis evaluation of text embedding robustness using LLM transformations, finding decoupled profiles across models and that scaling does not close all robustness gaps.
IdioLink: Retrieving Meaning Beyond Words Across Idiomatic and Literal Expressions cs.CL · 2026-05-21 · unverdicted · none · ref 59 · internal anchor
IdioLink introduces a benchmark dataset and evaluation showing that strong embedding models struggle to retrieve equivalent meanings across idiomatic and literal forms, relying on shallow cues instead.
Generative Conversational Recommender System cs.IR · 2026-05-21 · unverdicted · none · ref 28 · internal anchor
A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.
Linked Multi-Model Data on Russian Domestic and Foreign Policy Speeches cs.CL · 2026-05-15 · unverdicted · none · ref 58 · internal anchor
A new linked multimodal dataset of Russian domestic and foreign policy speeches with texts, images, captions, harmonized metadata, and expert-refined topic annotations is introduced to support analyses in political communication and LLM applications.
Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture cs.CV · 2026-05-14 · unverdicted · none · ref 33 · internal anchor
TWN attaches separate reasoning and embedding LoRA adapters to a frozen backbone with gradient detachment and a self-supervised gate that decides per input whether to generate CoT, achieving SOTA on MMEB-V2 with 3-5% added parameters and up to 50% fewer reasoning tokens.
Very Efficient Listwise Multimodal Reranking for Long Documents cs.IR · 2026-05-12 · unverdicted · none · ref 49 · internal anchor
ZipRerank delivers state-of-the-art multimodal listwise reranking accuracy for long documents at up to 10x lower latency via early interaction and single-pass scoring.
Breaking $\textit{Winner-Takes-All}$: Cooperative Policy Optimization Improves Diverse LLM Reasoning cs.AI · 2026-05-12 · unverdicted · none · ref 33 · 2 links · internal anchor
GCPO uses team-level credit assignment via determinant volume over reward-weighted semantic embeddings to promote non-redundant correct reasoning paths, improving both accuracy and diversity in LLM training.
MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image cs.LG · 2026-05-11 · unverdicted · none · ref 105 · internal anchor
MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.
Skill Description Deception Attack against Task Routing in Internet of Agents cs.MA · 2026-05-11 · conditional · none · ref 16 · internal anchor
Malicious agents can deceive LLM-based task routers in Internet of Agents systems by generating fake skill descriptions, achieving up to 98% success rate across nine domains.
LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG cs.CL · 2026-05-07 · unverdicted · none · ref 34 · internal anchor
LatentRAG performs agentic RAG by generating latent tokens for thoughts and subqueries in one forward pass, matching explicit methods' accuracy on seven benchmarks while reducing latency by ~90%.
SkillRet: A Large-Scale Benchmark for Skill Retrieval in LLM Agents cs.AI · 2026-05-07 · unverdicted · none · ref 28 · internal anchor
SkillRet benchmark shows fine-tuned retrievers improve NDCG@10 by 13+ points over prior models on large-scale skill retrieval for LLM agents.
TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding cs.CL · 2026-05-06 · unverdicted · none · ref 32 · internal anchor
TabEmbed is the first generalist embedding model for tabular data that unifies classification and retrieval in one space via contrastive learning and outperforms text embedding models on the new TabBench benchmark.
Embedding-based In-Context Prompt Training for Enhancing LLMs as Text Encoders cs.CL · 2026-05-02 · unverdicted · none · ref 15 · internal anchor
EPIC trains LLMs to treat continuous embeddings as in-context prompts, yielding state-of-the-art text embedding performance on MTEB with or without prompts at inference and lower compute.
Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory cs.CL · 2026-05-01 · unverdicted · none · ref 18 · internal anchor
MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.
From Static Analysis to Audience Dissemination: A Training-Free Multimodal Controversy Detection Multi-Agent Framework cs.LG · 2026-05-01 · unverdicted · none · ref 23 · internal anchor
AuDisAgent reformulates multimodal controversy detection as a dynamic audience dissemination process using screening, panel discussion, and arbitration agents, plus comment bootstrapping, and reports outperforming prior static methods on a public dataset.
Why Mean Pooling Works: Quantifying Second-Order Collapse in Text Embeddings cs.CL · 2026-04-30 · unverdicted · none · ref 15 · internal anchor
Modern text encoders resist second-order collapse under mean pooling because token embeddings concentrate tightly within texts, and this resistance correlates with stronger downstream performance.
Prism-Reranker: Beyond Relevance Scoring -- Jointly Producing Contributions and Evidence for Agentic Retrieval cs.IR · 2026-04-26 · accept · none · ref 33 · internal anchor
Prism-Reranker models output relevance, contribution statements, and evidence passages to support agentic retrieval beyond scalar scoring.
HaS: Accelerating RAG through Homology-Aware Speculative Retrieval cs.IR · 2026-04-22 · unverdicted · none · ref 31 · internal anchor
HaS accelerates RAG retrieval via homology-aware speculative retrieval and homologous query re-identification validation, cutting latency 24-37% with 1-2% accuracy drop on tested datasets.
Latent Abstraction for Retrieval-Augmented Generation cs.CL · 2026-04-20 · unverdicted · none · ref 38 · internal anchor
LAnR unifies retrieval-augmented generation inside a single LLM by deriving dense retrieval vectors from a [PRED] token's hidden states and using entropy to adaptively stop retrieval, outperforming prior RAG on six QA benchmarks with better efficiency.
OmniGCD: Abstracting Generalized Category Discovery for Modality Agnosticism cs.CV · 2026-04-16 · unverdicted · none · ref 57 · internal anchor
OmniGCD trains a Transformer once on synthetic data to enable zero-shot generalized category discovery across 16 datasets in four modalities without any dataset-specific fine-tuning.
DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math? cs.AI · 2026-04-10 · unverdicted · none · ref 9 · internal anchor
DRBENCHER generates multi-hop questions across biochemistry, finance, geophysics, security, and history that test interleaved browsing and computation, where the strongest models reach only 20% accuracy and human validation finds 76% validity.
Retrieval Augmented Conversational Recommendation with Reinforcement Learning cs.IR · 2026-04-06 · unverdicted · none · ref 58 · internal anchor
RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.
PLUME: Latent Reasoning Based Universal Multimodal Embedding cs.CV · 2026-04-02 · unverdicted · none · ref 38 · internal anchor
PLUME uses latent-state autoregressive rollouts and a progressive training curriculum to deliver efficient reasoning for universal multimodal embeddings without generating explicit rationales.
Spectral Tempering for Embedding Compression in Dense Passage Retrieval cs.IR · 2026-03-19 · unverdicted · none · ref 43 · internal anchor
Spectral Tempering derives an adaptive scaling factor γ(k) from the embedding eigenspectrum via local SNR analysis and knee-point normalization to achieve near-optimal compression without training or validation.
LMEB: Long-horizon Memory Embedding Benchmark cs.CL · 2026-03-13 · unverdicted · none · ref 33 · internal anchor
LMEB benchmark shows that embedding models' performance on traditional retrieval does not transfer to long-horizon memory tasks, larger models do not always perform better, and LMEB measures capabilities orthogonal to MTEB.
LSTM-MAS: A Long Short-Term Memory Inspired Multi-Agent System for Long-Context Understanding cs.CL · 2026-01-17 · unverdicted · none · ref 34 · internal anchor
LSTM-MAS uses a chained multi-agent architecture modeled on LSTM input, forget, and output gates to improve long-context QA performance and reduce hallucinations compared with prior multi-agent baselines.
Retrieval as a Decision: Training-Free Adaptive Gating for Efficient RAG cs.CL · 2025-11-12 · conditional · none · ref 15 · internal anchor
TARG uses uncertainty scores from a short no-context draft to gate retrieval in RAG, matching Always-RAG accuracy while cutting retrievals by 70-90% on QA benchmarks.
HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation cs.CL · 2025-10-09 · unverdicted · none · ref 23 · internal anchor
HiPRAG adds hierarchical process rewards to RL training for agentic RAG, reducing over-search to 2.3% and achieving 65.4-67.2% accuracy on seven QA benchmarks across 3B and 7B models.
Group-in-Group Policy Optimization for LLM Agent Training cs.LG · 2025-05-16 · unverdicted · none · ref 70 · internal anchor
GiGPO adds a hierarchical grouping mechanism to group-based RL so that LLM agents receive both global trajectory and local step-level credit signals, yielding >12% gains on ALFWorld and >9% on WebShop over GRPO while keeping the same rollout and memory footprint.
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs cs.CL · 2024-12-22 · unverdicted · none · ref 63 · internal anchor
GME achieves state-of-the-art results in universal multimodal retrieval by training on a balanced synthetic multimodal dataset.
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks cs.CV · 2024-10-07 · conditional · none · ref 32 · internal anchor
VLM2Vec converts state-of-the-art vision-language models into universal multimodal embedders via contrastive training on the new MMEB benchmark, delivering 10-20% absolute gains over prior models on both in-distribution and out-of-distribution tasks.
Vector Retrieval with Similarity and Diversity: How Hard Is It? cs.IR · 2024-07-05 · unverdicted · none · ref 17 · internal anchor
VRSD is defined by maximizing query-to-sum similarity, proven NP-complete, with a parameter-free heuristic outperforming MMR and DPP baselines.
M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation cs.CL · 2024-02-05 · unverdicted · none · ref 35 · internal anchor
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries cs.CL · 2024-01-27 · accept · none · ref 42 · internal anchor
MultiHop-RAG is a new benchmark dataset demonstrating that existing retrieval-augmented generation systems perform poorly on multi-hop queries requiring retrieval and reasoning over multiple evidence pieces.
C-Pack: Packed Resources For General Chinese Embeddings cs.CL · 2023-09-14 · accept · none · ref 58 · internal anchor
C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.
Field Order Should Not Matter: Permutation-Invariant Embedding Model Fine-Tuning for Structured Metadata Retrieval cs.CL · 2026-06-29 · unverdicted · none · ref 15 · internal anchor
Permutation-invariant fine-tuning (PI-FT) randomizes field order and applies dropout during embedding model training to eliminate sensitivity to serialization order, reducing order-change penalty from 7.4 to 0.2 nDCG@10 on a generated multilingual DevDataBench while outperforming zero-shot baselines
Semantic insurance pricing with large language models stat.AP · 2026-06-28 · unverdicted · none · ref 57 · internal anchor
LLM embeddings from policy text outperform hand-engineered features in a GLM for French motor insurance claim frequency, with larger gains at small sample sizes and further improvement from insurance-specific fine-tuning.
SHARD: cell-keyed residual splitting for alignment-resistant private dense retrieval cs.CR · 2026-06-26 · unverdicted · none · ref 2 · internal anchor
SHARD introduces cell-keyed residual splitting that turns dense retrieval embeddings into revocable, renewable, unlinkable templates resistant to alignment attacks while preserving exact utility under CKKS reranking.
Toward a Hybrid Digital Twin of Society: Quantifying Cognitive-Spatial Linkages Through Online-Offline Feedback Networks physics.soc-ph · 2026-06-25 · unverdicted · none · ref 12 · internal anchor
A Feedback Network model is developed showing online semantic exploration is more concentrated than physical mobility, with stable retail-business linkages and greater COVID disruption to spatial than cognitive routines, as a step toward hybrid digital twins of society.

Text Embeddings by Weakly-Supervised Contrastive Pre-training

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer