PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

Ching-Yu Tsai; Kuan-Yu Chen; Shou-De Lin; Yuan-Hao Chen; Yu-Che Tsai; Yu-Han Chang; Yu-Hsiang Chuang


Large Language Models (LLMs) have demonstrated remarkable efficacy in text embedding, yet current adaptation methods such as LoRA face significant bottlenecks in computational efficiency and cross-architecture transferability. Whenever a new backbone emerges, existing approaches require costly retraining from scratch. To address this, we propose PromptEmbedder, a novel dual-LLM framework that decouples embedding knowledge from specific backbone weights. PromptEmbedder uses a Prompting LLM to generate instruction-aware soft prompts for a frozen Embedding LLM via a differentiable generation process with continuous relaxation, ensuring full gradient flow during contrastive training. Because task-specific knowledge is localized within the Prompting LLM, adapting to a new architecture requires retraining only a lightweight linear alignment matrix. Evaluations on the MTEB benchmark show that PromptEmbedder achieves performance comparable to LoRA fine-tuning while reducing GPU memory usage by 40% and accelerating training by 3.7x. Our approach establishes a scalable, architecture-agnostic paradigm for efficient LLM-based representation learning.
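To make the soft-prompt idea concrete, the following is a minimal sketch of one common form of continuous relaxation, not the authors' implementation. The abstract does not specify the relaxation, so this sketch assumes a softmax-weighted average of token embeddings in place of a discrete argmax, followed by a linear alignment matrix. All names (`soft_token`, `align`, `E`, `W`) and the toy dimensions are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_token(logits, embedding_table, temperature=1.0):
    # Assumed relaxation: the expected embedding under the softmax
    # distribution, which is differentiable, unlike picking argmax.
    probs = softmax(logits, temperature)
    dim = len(embedding_table[0])
    return [sum(p * row[j] for p, row in zip(probs, embedding_table))
            for j in range(dim)]

def align(vec, W):
    # Linear map from the Prompting LLM's embedding space to the
    # Embedding LLM's input space; W has shape (d_out, d_in).
    return [sum(w * v for w, v in zip(row, vec)) for row in W]

# Toy setup (hypothetical): vocabulary of 3 tokens, d_in = 2, d_out = 3.
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # Prompting LLM token embeddings
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # alignment matrix
logits = [2.0, 0.5, -1.0]                  # Prompting LLM output logits

soft = soft_token(logits, E)               # relaxed "token" in d_in space
prompt_vec = align(soft, W)                # soft prompt vector in d_out space
```

Under these assumptions, only `W` depends on the Embedding LLM's input dimension, which illustrates why swapping the backbone could require retraining just the alignment matrix. Lowering `temperature` pushes the relaxed token toward the embedding of the highest-scoring token.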

