hub Mixed citations

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Xiang Lisa Li, Percy Liang · 2021 · cs.CL · arXiv 2101.00190

Mixed citation behavior. Most common role is background (56%).

83 Pith papers citing it

Background 56% of classified citations

open full Pith review browse 83 citing papers arXiv PDF

abstract

Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen, but optimizes a small continuous task-specific vector (called the prefix). Prefix-tuning draws inspiration from prompting, allowing subsequent tokens to attend to this prefix as if it were "virtual tokens". We apply prefix-tuning to GPT-2 for table-to-text generation and to BART for summarization. We find that by learning only 0.1\% of the parameters, prefix-tuning obtains comparable performance in the full data setting, outperforms fine-tuning in low-data settings, and extrapolates better to examples with topics unseen during training.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 9 method 5 baseline 2 dataset 1 other 1

citation-polarity summary

background 10 use method 4 baseline 2 unclear 1 use dataset 1

claims ledger

abstract Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, we propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks, which keeps language model parameters frozen, but optimizes a small continuous task-specific vector (called the prefix). Prefix-tuning draws inspiration from prompting, allowing subsequent tokens to attend to this prefix as if it were "virtual tokens". We

co-cited works

representative citing papers

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

ART optimizes visual pixel inputs to frozen MLLMs to achieve LoRA-competitive accuracy on math and structured tool-use benchmarks without modifying computational graphs.

Parameter-Efficient Fine-Tuning with Learnable Rank

cs.CL · 2026-06-03 · unverdicted · novelty 7.0

LR-LoRA learns per-layer adapter ranks during training and reports outperforming fixed-rank LoRA and other PEFT baselines on language understanding and commonsense reasoning tasks.

PrAda: Few-Shot Visual Adaptation for Text-Prompted Segmentation

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

PrAda adapts text-prompted segmentation models in a few-shot setting by learning and fusing class-specific prototypes from fine-grained and high-level features, yielding significant gains on semantic, instance, and panoptic segmentation across five benchmarks.

Adaptive and Fine-grained Module-wise Expert Pruning for Efficient LoRA-MoE Fine-Tuning

cs.LG · 2026-04-29 · unverdicted · novelty 7.0

DMEP prunes experts module-by-module in LoRA-MoE and removes load balancing after pruning, cutting trainable parameters 35-43% and raising throughput ~10% while matching or exceeding uniform baselines on reasoning tasks.

A Hormone-inspired Emotion Layer for Transformer language models (HELT)

cs.NE · 2026-04-13 · unverdicted · novelty 7.0

HormoneT5 augments T5 with a hormone-inspired block that predicts six continuous emotion values and uses them to modulate responses, reporting over 85% per-hormone accuracy and human preference for emotional quality.

Graph Topology Information Enhanced Heterogeneous Graph Representation Learning

cs.LG · 2026-04-07 · unverdicted · novelty 7.0

ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.

Exploring Cross-Modal Flows for Few-Shot Learning

cs.CV · 2025-10-16 · unverdicted · novelty 7.0

FMA introduces flow matching for multi-step cross-modal feature alignment in few-shot learning, using fixed coupling, noise augmentation, and early-stopping to outperform one-step PEFT methods.

Multimodal Policy Internalization for Conversational Agents

cs.CL · 2025-10-10 · unverdicted · novelty 7.0

The paper defines the MPI task and proposes TriMPI, a three-stage training pipeline of continual pretraining, supervised finetuning, and policy-aware reinforcement learning that internalizes multimodal policies into model parameters for improved adherence without prompts at inference.

Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting

cs.CV · 2025-08-06 · unverdicted · novelty 7.0

The paper offers a comprehensive survey and proposes a new taxonomy for continual learning strategies in VLMs and MLLMs to combat catastrophic forgetting beyond traditional methods.

Learning Interactive Real-World Simulators

cs.AI · 2023-10-09 · conditional · novelty 7.0

UniSim learns a universal real-world simulator from orchestrated diverse datasets, enabling zero-shot deployment of policies trained purely in simulation.

Efficient Memory Management for Large Language Model Serving with PagedAttention

cs.LG · 2023-09-12 · conditional · novelty 7.0

PagedAttention achieves near-zero waste in LLM key-value cache memory and enables 2-4x higher serving throughput than prior systems.

Large Language Models as Optimizers

cs.LG · 2023-09-07 · unverdicted · novelty 7.0

Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-designed baselines.

Steering Language Models With Activation Engineering

cs.CL · 2023-08-20 · unverdicted · novelty 7.0

Activation Addition steers language models by adding contrastive activation vectors from prompt pairs to control high-level properties like sentiment and toxicity at inference time without training.

QLoRA: Efficient Finetuning of Quantized LLMs

cs.LG · 2023-05-23 · conditional · novelty 7.0

QLoRA finetunes 4-bit quantized LLMs via LoRA adapters to match full-precision performance while using far less memory, enabling 65B-scale training on single GPUs and producing Guanaco models near ChatGPT level.

LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

cs.CV · 2023-03-28 · conditional · novelty 7.0

LLaMA-Adapter turns frozen LLaMA 7B into a capable instruction follower using only 1.2M new parameters and zero-init attention, matching Alpaca while extending to image-conditioned reasoning on ScienceQA and COCO.

Flamingo: a Visual Language Model for Few-Shot Learning

cs.CV · 2022-04-29 · unverdicted · novelty 7.0

Flamingo models reach new state-of-the-art few-shot results on image and video tasks by bridging frozen vision and language models with cross-attention layers trained on interleaved web-scale data.

LoRA: Low-Rank Adaptation of Large Language Models

cs.CL · 2021-06-17 · accept · novelty 7.0

Adapting large language models by training only a low-rank decomposition BA added to frozen weight matrices matches full fine-tuning while cutting trainable parameters by orders of magnitude and adding no inference latency.

Cross-Task Generalization via Natural Language Crowdsourcing Instructions

cs.CL · 2021-04-18 · conditional · novelty 7.0

Presents the NATURAL INSTRUCTIONS meta-dataset and shows generative pre-trained language models achieve 19% better generalization to unseen tasks when using task instructions.

EPnG: Adaptive Expert Prune-and-Grow for Parameter-Efficient MoE Fine-tuning

cs.LG · 2026-07-02 · unverdicted · novelty 6.0

EPnG reallocates LoRA capacity in MoE models by pruning experts with low router gate probabilities and expanding high-importance ones via rank growth, outperforming standard LoRA and nearing full fine-tuning performance with 0.55-0.72% parameters updated.

MindAlign: Decoding Inner Speech from fMRI Signals via Multimodal Embedding Alignment under Limited Data

cs.CL · 2026-06-15 · unverdicted · novelty 6.0

MindAlign decodes inner speech from fMRI via subject-specific neural-semantic alignment into a multimodal space followed by prompting of a frozen LM, outperforming baselines and generalizing across subjects.

Substrate Asymmetry in User-Side Memory: A Diagnostic Framework

cs.CL · 2026-06-10 · unverdicted · novelty 6.0

User memory in LLMs factors into three orthogonal axes where parametric adapters and retrieval show opposite strengths, with causal evidence from attention interventions and an alignment tax on RLHF models.

Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

SDBN introduces adversarial training to PEFT via two variants using character-level edits and LLM-generated perturbations, claiming improved robustness and generalization on NLP benchmarks in low-resource noisy settings.

5% > 100%: Flatness Preference is All You Need for Multimodal Parameter-Efficient Fine-Tuning

cs.CV · 2026-06-09 · unverdicted · novelty 6.0

Flatness Preference Optimization (FlatPO) improves multimodal PEFT generalization by flattening a small set of sharp dimensions that dominate performance.

End-to-End Context Compression at Scale

cs.CL · 2026-06-08 · unverdicted · novelty 6.0

LCLMs are scaled 0.6B-encoder 4B-decoder compressors pre-trained on over 350B tokens that improve the Pareto frontier for general-task performance, compression speed, and peak memory in long-context language model inference.

citing papers explorer

Showing 24 of 24 citing papers after filters.

Parameter-Efficient Fine-Tuning with Learnable Rank cs.CL · 2026-06-03 · unverdicted · none · ref 38 · internal anchor
LR-LoRA learns per-layer adapter ranks during training and reports outperforming fixed-rank LoRA and other PEFT baselines on language understanding and commonsense reasoning tasks.
Multimodal Policy Internalization for Conversational Agents cs.CL · 2025-10-10 · unverdicted · none · ref 1 · internal anchor
The paper defines the MPI task and proposes TriMPI, a three-stage training pipeline of continual pretraining, supervised finetuning, and policy-aware reinforcement learning that internalizes multimodal policies into model parameters for improved adherence without prompts at inference.
Steering Language Models With Activation Engineering cs.CL · 2023-08-20 · unverdicted · none · ref 97 · internal anchor
Activation Addition steers language models by adding contrastive activation vectors from prompt pairs to control high-level properties like sentiment and toxicity at inference time without training.
LoRA: Low-Rank Adaptation of Large Language Models cs.CL · 2021-06-17 · accept · none · ref 29 · internal anchor
Adapting large language models by training only a low-rank decomposition BA added to frozen weight matrices matches full fine-tuning while cutting trainable parameters by orders of magnitude and adding no inference latency.
Cross-Task Generalization via Natural Language Crowdsourcing Instructions cs.CL · 2021-04-18 · conditional · none · ref 23 · internal anchor
Presents the NATURAL INSTRUCTIONS meta-dataset and shows generative pre-trained language models achieve 19% better generalization to unseen tasks when using task instructions.
MindAlign: Decoding Inner Speech from fMRI Signals via Multimodal Embedding Alignment under Limited Data cs.CL · 2026-06-15 · unverdicted · none · ref 39 · internal anchor
MindAlign decodes inner speech from fMRI via subject-specific neural-semantic alignment into a multimodal space followed by prompting of a frozen LM, outperforming baselines and generalizing across subjects.
Substrate Asymmetry in User-Side Memory: A Diagnostic Framework cs.CL · 2026-06-10 · unverdicted · none · ref 54 · internal anchor
User memory in LLMs factors into three orthogonal axes where parametric adapters and retrieval show opposite strengths, with causal evidence from attention interventions and an alignment tax on RLHF models.
Small Data, Big Noise: Adversarial Training for Robust Parameter-Efficient Fine-Tuning cs.CL · 2026-06-09 · unverdicted · none · ref 22 · internal anchor
SDBN introduces adversarial training to PEFT via two variants using character-level edits and LLM-generated perturbations, claiming improved robustness and generalization on NLP benchmarks in low-resource noisy settings.
End-to-End Context Compression at Scale cs.CL · 2026-06-08 · unverdicted · none · ref 46 · internal anchor
LCLMs are scaled 0.6B-encoder 4B-decoder compressors pre-trained on over 350B tokens that improve the Pareto frontier for general-task performance, compression speed, and peak memory in long-context language model inference.
PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts cs.CL · 2026-05-13 · unverdicted · none · ref 56 · internal anchor
PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
XPERT: Expert Knowledge Transfer for Effective Training of Language Models cs.CL · 2026-05-09 · unverdicted · none · ref 76 · internal anchor
XPERT extracts and reuses cross-domain expert knowledge from pre-trained MoE LLMs via inference analysis and tensor decomposition to improve performance and convergence in downstream language model training.
TLoRA: Task-aware Low Rank Adaptation of Large Language Models cs.CL · 2026-04-20 · unverdicted · none · ref 27 · internal anchor
TLoRA jointly optimizes LoRA initialization via task-data SVD and sensitivity-driven rank allocation, delivering stronger results than standard LoRA across NLU, reasoning, math, code, and chat tasks while using fewer trainable parameters.
Transforming External Knowledge into Triplets for Enhanced Retrieval in RAG of LLMs cs.CL · 2026-04-14 · unverdicted · none · ref 44 · internal anchor
Tri-RAG turns external knowledge into Condition-Proof-Conclusion triplets and retrieves via the Condition anchor to improve efficiency and quality in LLM RAG.
PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention cs.CL · 2025-06-16 · unverdicted · none · ref 16 · internal anchor
PrefixMemory-Tuning decouples the prefix from attention to overcome performance limits of traditional prefix-tuning and reaches competitive results with modern PEFT methods on LLM adaptation benchmarks.
Towards Expert-Level Medical Question Answering with Large Language Models cs.CL · 2023-05-16 · unverdicted · none · ref 68 · internal anchor
Med-PaLM 2 achieves 86.5% accuracy on MedQA and approaches or exceeds prior state-of-the-art on other medical QA benchmarks while receiving higher physician preference ratings than human answers on consumer questions.
ST-MoE: Designing Stable and Transferable Sparse Expert Models cs.CL · 2022-02-17 · unverdicted · none · ref 171 · internal anchor
ST-MoE introduces stability techniques for sparse expert models, allowing a 269B-parameter model to achieve state-of-the-art transfer learning results across reasoning, summarization, and QA tasks at the compute cost of a 32B dense model.
Beyond Isolated Behaviors: Hierarchical User Modeling for LLM Personalization cs.CL · 2026-06-01 · unverdicted · none · ref 29 · internal anchor
PHF applies Bourdieu's Theory of Practice to create hierarchical user models for LLM personalization and reports consistent gains on the LaMP benchmark.
Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting cs.CL · 2026-05-28 · unverdicted · none · ref 34 · internal anchor
A plug-and-play KL regularizer that masks the target token and renormalizes probabilities to improve the learning-forgetting trade-off in LoRA adaptation of LLMs.
DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation cs.CL · 2026-05-15 · unverdicted · none · ref 29 · internal anchor
DebiasRAG uses a three-stage RAG process to generate and rerank query-specific debiasing contexts that act as fairness constraints for LLM outputs.
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models cs.CL · 2026-01-20 · unverdicted · none · ref 176 · internal anchor
The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.
RELOOP: Recursive Retrieval with Multi-Hop Reasoner and Planners for Heterogeneous QA cs.CL · 2025-10-23 · unverdicted · none · ref 18 · internal anchor
RELOOP unifies retrieval across text, tables, and KGs via hierarchical sequences and dual-agent guided iteration, reporting EM/F1 gains over baselines on HotpotQA, HybridQA/TAT-QA, and MetaQA.
A Survey on Large Language Models for Code Generation cs.CL · 2024-06-01 · unverdicted · none · ref 152 · internal anchor
A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark comparisons.
Small Language Models (SLMs) Can Still Pack a Punch: A survey (updated 2026) cs.CL · 2025-01-03 · unverdicted · none · ref 69 · internal anchor
A literature survey of Small Language Models (1-8B parameters) that can perform comparably or better than larger models, covering general-purpose and task-specific approaches plus creation techniques.
A Comprehensive Overview of Large Language Models cs.CL · 2023-07-12 · unverdicted · none · ref 41 · internal anchor
A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.

Prefix-Tuning: Optimizing Continuous Prompts for Generation

hub tools

citation-role summary

citation-polarity summary

claims ledger

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer