LoRA-Key creates a standalone user-specific Watermark LoRA trained with a latent watermark prior and GOP, attachable via training-free superposition to protect LoRA ownership while preserving quality.
hub Canonical reference
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Canonical reference. 100% of citing Pith papers cite this work as background.
abstract
Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often consisting of billions of parameters, require vast amounts of computational resources for execution. Especially, the expansive scale and computational demands pose considerable challenges when customizing them for particular downstream tasks, particularly over the hardware platforms constrained by computational capabilities. Parameter Efficient Fine-Tuning (PEFT) provides a practical solution by efficiently adjusting the large models over the various downstream tasks. In particular, PEFT refers to the process of adjusting the parameters of a pre-trained large model to adapt it to a specific task or domain while minimizing the number of additional parameters introduced or computational resources required. This approach is particularly important when dealing with large-scale language models with high parameter counts, as fine-tuning these models from scratch can be computationally expensive and resource-intensive, posing considerable challenges in the supporting system platform design. In this survey, we present comprehensive studies of various PEFT algorithms, examining their performance and computational overhead. Moreover, we provide an overview of applications developed using different PEFT algorithms and discuss common techniques employed to mitigate computation costs for PEFT. In addition to providing an extensive survey from an algorithmic standpoint, we also examine various real-world system designs to investigate the implementation costs associated with different PEFT approaches. This survey serves as a valuable resource for researchers aiming to understand both the PEFT algorithm and its system implementation, offering detailed ......
hub tools
citation-role summary
citation-polarity summary
polarities
background 12representative citing papers
LOFT unifies orthogonal PEFT by treating adaptation as low-rank subspace rotation and adds task-aware support selection that improves efficiency under fixed budgets.
TC-UMIA is a population-level attack using pre- and post-unlearning predictions to infer membership across forget, retain, and unseen sets, revealing added privacy leakage to retained data.
A method using predicted rectification difficulty for optimal human sample allocation in LLM-augmented surveys captures 61-79% of theoretical efficiency gains and reduces MSE by 11% on two datasets without pilot data.
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.
Supervised clinical section segmentation models perform strongly in-domain on MIMIC-III but degrade substantially out-of-domain on a new obstetrics dataset, whereas zero-shot LLMs show robust cross-domain performance after hallucination correction.
InstructMoLE replaces per-token routing with instruction-guided global routing for mixture-of-low-rank-experts in diffusion transformers and adds an output-space orthogonality loss to improve multi-conditional image generation.
LLM embeddings condition a generative transformer to enable faster convergence, better performance, and generalization to unseen LHC processes using a single model.
STAMP uses a curated 1.8M-pair spatial transcriptomics atlas and pathway-informed alignment to augment pathology foundation models for molecular phenotype inference from H&E WSIs.
FedSmoothLoRA improves federated LoRA fine-tuning by constructing local initializations from a round-matching matrix for cross-round continuity and a gradient-aligned matrix for client-specific guidance, yielding faster convergence than prior methods in image and text tasks.
HERA is a select-regularize-calibrate framework adapting frozen vision foundation models for cross-domain few-shot semantic segmentation via hierarchical layer selection with ETR, prior-guided regularization, and pixelwise adaptive calibration, reporting over 4.1 mIoU gains.
Federated PEFT on LLMs across healthcare and finance datasets performs close to centralized training and beats isolated local training under non-IID conditions.
Discriminative factorization distinguishes high-quality query sets for black-box model classification, with chance-level error decaying exponentially in query budget and parameters predicting empirical decay rates on auditing tasks.
Pretraining induces stable leading singular vectors that form a reusable spectral basis inherited by downstream tasks, enabling competitive performance with 0.2% trainable parameters on GLUE.
This work provides the first systematic study of transferring direct-coded spiking neural networks to event-based representations while aiming to preserve accuracy and reduce energy use.
DKPS-based methods predict new model benchmark scores using cached responses, matching baseline mean absolute error with substantially fewer queries and an offline query selection approach.
Constant-context skill learning trains reusable task-family modules for LLM agents using a deterministic state block for progress tracking and subgoal rewards, achieving 89.6% unseen success on ALFWorld, 76.8% on WebShop, and 66.4% on SciWorld with Qwen3-8B while reducing prompt tokens 2-7x.
NeWTral is a non-linear weight translation framework using MoE routing that reduces average attack success rate from 70% to 13% on unsafe domain adapters across Llama, Mistral, Qwen, and Gemma models up to 72B while retaining 90% knowledge fidelity.
TLoRA jointly optimizes LoRA initialization via task-data SVD and sensitivity-driven rank allocation, delivering stronger results than standard LoRA across NLU, reasoning, math, code, and chat tasks while using fewer trainable parameters.
BioTrain enables full-network fine-tuning of biosignal AI models on edge MCUs with sub-MB memory and sub-50mW power, delivering up to 35% accuracy gains and 8.1x memory reduction.
MP-ISMoE uses Gaussian noise perturbed iterative quantization and interactive side mixture-of-experts to deliver higher accuracy than prior memory-efficient transfer learning methods while keeping similar parameter and memory usage.
ORPO is most effective at misaligning LLMs while DPO excels at realigning them, though it reduces utility, revealing an asymmetry between attack and defense methods.
UATTA adapts pre-trained text-image models at test time without labels by using disagreement in bidirectional retrieval rankings to estimate and mitigate uncertainty for improved person search.
Succinct Model Difference Proofs certify that a neural-network update stays inside a policy-defined drift class using zero-knowledge proofs whose cost depends only on the drift structure.
citing papers explorer
-
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.