Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

24 Pith papers cite this work. Polarity classification is still indexing.

abstract

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
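The training scheme the abstract describes (pick parameters so that a few gradient steps on a new task's small training set generalize well) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy setting chosen for illustration only: a scalar linear model `y = w * x`, tasks that differ only in their slope, one inner gradient step, and hypothetical names and step sizes (`alpha` for the inner step, `beta` for the meta step). Because the model is a single scalar, the derivative of the adapted parameter with respect to the initial one can be written in closed form instead of using automatic differentiation.

```python
import numpy as np


def loss_and_grad(w, x, y):
    """Mean squared error of y_hat = w * x and its derivative w.r.t. w."""
    err = w * x - y
    return float(np.mean(err ** 2)), float(2.0 * np.mean(x * err))


def inner_adapt(w, x, y, alpha):
    """One gradient step on a task's support (training) data."""
    _, grad = loss_and_grad(w, x, y)
    return w - alpha * grad


def maml_meta_grad(w, support, query, alpha):
    """Query loss after adaptation, and its derivative w.r.t. the initial w.

    Chain rule: dL_query/dw = dL_query/dw' * dw'/dw, where
    w' = w - alpha * grad_support(w) and dw'/dw = 1 - alpha * d2L_support/dw2.
    """
    xs, ys = support
    xq, yq = query
    w_adapted = inner_adapt(w, xs, ys, alpha)
    query_loss, grad_query = loss_and_grad(w_adapted, xq, yq)
    hessian_support = float(2.0 * np.mean(xs ** 2))  # second derivative of the MSE
    return query_loss, grad_query * (1.0 - alpha * hessian_support)


def sample_task(rng, n=5):
    """Hypothetical task family: y = slope * x with slope drawn from [1, 3]."""
    slope = rng.uniform(1.0, 3.0)
    x = rng.uniform(-1.0, 1.0, size=2 * n)
    y = slope * x
    return (x[:n], y[:n]), (x[n:], y[n:])


def meta_train(steps=500, tasks_per_step=8, alpha=0.1, beta=0.05, seed=0):
    """Gradient descent on the post-adaptation query loss, averaged over tasks."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(steps):
        grads = []
        for _ in range(tasks_per_step):
            support, query = sample_task(rng)
            grads.append(maml_meta_grad(w, support, query, alpha)[1])
        w -= beta * float(np.mean(grads))
    return w
```

Calling `meta_train()` returns an initial parameter from which a single call to `inner_adapt` on five support points moves toward the slope of a newly sampled task. For a deep network the same meta-gradient would normally be obtained by differentiating through the inner update with an automatic differentiation library rather than by hand.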

citation-role summary

background 2

citation-polarity summary

roles: background 2

polarities: background 2

representative citing papers

Language Models are Few-Shot Learners

cs.CL · 2020-05-28 · accept · novelty 8.0

GPT-3 shows that scaling an autoregressive language model to 175 billion parameters enables strong few-shot performance across diverse NLP tasks via in-context prompting without fine-tuning.

CoRMA: Contrastive RMA for Contact-Rich Meta-Adaptation

cs.RO · 2026-05-21 · unverdicted · novelty 7.0

CoRMA enables within-episode adaptation for contact-rich robotic assembly by inferring semantic contact context with a causal Transformer and force-regime contrastive objective, retaining higher real success than FORGE baselines under target-pose noise.

Solving Rubik's Cube with a Robot Hand

cs.LG · 2019-10-16 · accept · novelty 7.0

Reinforcement learning models trained only in simulation using automatic domain randomization solve Rubik's cube with a real robot hand.

Language Models as Knowledge Bases?

cs.CL · 2019-09-03 · accept · novelty 7.0

BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.

Searching for Activation Functions

cs.NE · 2017-10-16 · conditional · novelty 7.0

Automated search discovers Swish activation f(x) = x * sigmoid(βx) that improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2.

Scaling Laws for Transfer

cs.LG · 2021-02-02 · unverdicted · novelty 6.0

Effective data transferred from pre-training to fine-tuning is described by a power law in model parameter count and fine-tuning dataset size, acting like a multiplier on the fine-tuning data.

Evolvability ES: Scalable and Direct Optimization of Evolvability

cs.NE · 2019-07-13 · unverdicted · novelty 6.0

Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.

Video Action Recognition Via Neural Architecture Searching

cs.CV · 2019-07-10 · unverdicted · novelty 6.0

Uses differentiable NAS with temporal segments and pseudo-3D operators to discover a video action recognition network that outperforms hand-designed models on UCF101 with ~1% of the parameters when trained from scratch.

Learning to Theorize the World from Observation

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

citing papers explorer

Showing 24 of 24 citing papers.

  • Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
    cs.CL · 2023-09-28 · unverdicted · none · ref 110 · internal anchor

    Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.

  • Language Models are Few-Shot Learners
    cs.CL · 2020-05-28 · accept · none · ref 14

    GPT-3 shows that scaling an autoregressive language model to 175 billion parameters enables strong few-shot performance across diverse NLP tasks via in-context prompting without fine-tuning.

  • CoRMA: Contrastive RMA for Contact-Rich Meta-Adaptation
    cs.RO · 2026-05-21 · unverdicted · none · ref 18 · internal anchor

    CoRMA enables within-episode adaptation for contact-rich robotic assembly by inferring semantic contact context with a causal Transformer and force-regime contrastive objective, retaining higher real success than FORGE baselines under target-pose noise.

  • Learned Memory Attenuation in Sage-Husa Kalman Filters for Robust UAV State Estimation
    eess.SP · 2026-05-18 · unverdicted · none · ref 37 · internal anchor

    NDR-SHKF replaces the static forgetting factor in Sage-Husa Kalman Filters with a learned vector-valued memory attenuation policy from a bifurcated recurrent network trained end-to-end on whitened innovations to minimize estimation error.

  • JanusPipe: Efficient Pipeline Parallel Training for Machine Learning Interatomic Potentials
    cs.DC · 2026-05-18 · unverdicted · none · ref 42 · 2 links · internal anchor

    JanusPipe introduces SymFold and WaveK to enable efficient 3D-parallel training for conservative MLIPs, reporting 1.51x and 1.45x average throughput gains over 1F1B and Hanayo baselines on 32 GPUs.

  • Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning
    cs.LG · 2026-05-15 · unverdicted · none · ref 15 · internal anchor

    Creativity is defined as meta-learning where a frozen diffusion creator optimizes candidates for rapid improvement by an adapting appraiser such as an autoencoder or CLIP adapter.

  • Solving Rubik's Cube with a Robot Hand
    cs.LG · 2019-10-16 · accept · none · ref 33 · internal anchor

    Reinforcement learning models trained only in simulation using automatic domain randomization solve Rubik's cube with a real robot hand.

  • Language Models as Knowledge Bases?
    cs.CL · 2019-09-03 · accept · none · ref 15 · internal anchor

    BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.

  • Learning to learn with quantum neural networks via classical neural networks
    quant-ph · 2019-07-11 · unverdicted · none · ref 44 · internal anchor

    Classical RNNs trained on small instances provide parameter initializations for QAOA and VQE that reduce total optimization iterations and generalize across problem sizes.

  • Searching for Activation Functions
    cs.NE · 2017-10-16 · conditional · none · ref 7

    Automated search discovers Swish activation f(x) = x * sigmoid(βx) that improves top-1 ImageNet accuracy over ReLU by 0.9% on Mobile NASNet-A and 0.6% on Inception-ResNet-v2.

  • Training Language Models to Self-Correct via Reinforcement Learning
    cs.LG · 2024-09-19 · unverdicted · none · ref 199 · internal anchor

    SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

  • Scaling Laws for Transfer
    cs.LG · 2021-02-02 · unverdicted · none · ref 170 · internal anchor

    Effective data transferred from pre-training to fine-tuning is described by a power law in model parameter count and fine-tuning dataset size, acting like a multiplier on the fine-tuning data.

  • Compressive Transformers for Long-Range Sequence Modelling
    cs.LG · 2019-11-13 · unverdicted · none · ref 92 · internal anchor

    Compressive Transformer sets new records on WikiText-103 (17.1 ppl) and Enwik8 (0.97 bpc) via memory compression and introduces the PG-19 long-range language benchmark.

  • Evolvability ES: Scalable and Direct Optimization of Evolvability
    cs.NE · 2019-07-13 · unverdicted · none · ref 11 · internal anchor

    Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.

  • Video Action Recognition Via Neural Architecture Searching
    cs.CV · 2019-07-10 · unverdicted · none · ref 23 · internal anchor

    Uses differentiable NAS with temporal segments and pseudo-3D operators to discover a video action recognition network that outperforms hand-designed models on UCF101 with ~1% of the parameters when trained from scratch.

  • Learning to Theorize the World from Observation
    cs.LG · 2026-05-05 · unverdicted · none · ref 136

    NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.

  • A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
    cs.LG · 2026-05-04 · unverdicted · none · ref 84

    MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.

  • Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware
    quant-ph · 2026-04-27 · unverdicted · none · ref 8

    A residual neural network trained on one quantum device's noise data can be fine-tuned with 20 samples from a second device to improve prediction of ideal circuit outputs, recovering 34.9% of the performance gap.

  • Language Models (Mostly) Know What They Know
    cs.CL · 2022-07-11 · unverdicted · none · ref 93

    Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

  • A General Language Assistant as a Laboratory for Alignment
    cs.CL · 2021-12-01 · conditional · none · ref 38

    Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.

  • MANGO: Meta-Adaptive Network Gradient Optimization for Online Continual Learning
    cs.LG · 2026-05-18 · unverdicted · none · ref 6 · internal anchor

    MANGO combines gradient-gating and meta-learned regularization to balance stability and plasticity in single-pass online continual learning, reporting state-of-the-art accuracy on CLEAR-10, CIFAR-100, and Tiny-ImageNet.

  • Counting voids and filaments: Betti Curves as a Powerful Probe for Cosmology
    astro-ph.CO · 2025-12-08 · unverdicted · none · ref 82 · internal anchor

    Betti curves from persistent homology of large-scale structure provide complementary cosmological constraints on n_s, σ_8, and Ω_m, with tighter bounds when analyzed jointly with the power spectrum.

  • LiLAW: Lightweight Learnable Adaptive Weighting to Learn Sample Difficulty & Improve Noisy Training
    cs.LG · 2025-09-25 · unverdicted · none · ref 6 · internal anchor

    LiLAW learns to weight samples as easy, moderate or hard using three global scalars updated by one gradient step on a validation batch to improve noisy training performance.

  • Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models
    cs.CL · 2026-04-22 · unverdicted · none · ref 34

    A 3B model with few-shot prompting reaches 79.7% of GPT-5 tool-use performance while a hypernetwork adaptation adds zero measurable benefit across four benchmarks.