Robust fine-tuning of zero-shot models

Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Y · 2022 · arXiv 2109.01903

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Editing Models with Task Arithmetic

cs.LG · 2022-12-08 · accept · novelty 8.0

Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

cs.CL · 2022-02-25 · accept · novelty 8.0

Randomly replacing labels in in-context demonstrations barely hurts performance, showing that label space, input distribution, and sequence format drive in-context learning more than ground-truth labels.

LAION-5B: An open large-scale dataset for training next generation image-text models

cs.CV · 2022-10-16 · accept · novelty 7.0

LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

cs.CV · 2022-04-01 · unverdicted · novelty 7.0

Socratic Models compose zero-shot multimodal reasoning by prompting pretrained language and vision models to exchange information, enabling new capabilities without fine-tuning.

Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DG-Hard uses Donoho-Gavish hard thresholding on the fine-tuning weight delta to separate task-aligned signal from noise-like residual, recovering damaged capabilities while preserving target-task gains.

Generalization Boundaries of Fine-Tuned Small Language Models for Graph Structural Inference

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

Fine-tuned small language models (3-4B parameters) preserve ordinal consistency in ranking graph structural properties for graphs larger than training data and from held-out families, showing architecture-specific degradation.

NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

cs.CV · 2026-04-13 · unverdicted · novelty 4.0

The NTIRE 2026 challenge provides a dataset of over 294,000 real and AI-generated images with 36 transformations to benchmark robust detection models.

Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

cs.LG · 2026-04-01 · unverdicted · novelty 4.0

Matched learning-rate experiments show that LoRA retains substantially higher zero-shot transfer than full fine-tuning when adapting CLIP (45% vs 11% on EuroSAT, 58% vs 9% on Pets).

citing papers explorer

Showing 8 of 8 citing papers.

Editing Models with Task Arithmetic · cs.LG · 2022-12-08 · accept · none · ref 107
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? · cs.CL · 2022-02-25 · accept · none · ref 158
LAION-5B: An open large-scale dataset for training next generation image-text models · cs.CV · 2022-10-16 · accept · none · ref 87
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language · cs.CV · 2022-04-01 · unverdicted · none · ref 58
Spectral Unforgetting: Post-Hoc Recovery of Damaged Capabilities Without Retraining · cs.LG · 2026-05-19 · unverdicted · none · ref 13
Generalization Boundaries of Fine-Tuned Small Language Models for Graph Structural Inference · cs.LG · 2026-04-20 · unverdicted · none · ref 10
NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild · cs.CV · 2026-04-13 · unverdicted · none · ref 82
Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP · cs.LG · 2026-04-01 · unverdicted · none · ref 15
