Neural Processes

Danilo J. Rezende; Dan Rosenbaum; Fabio Viola; Jonathan Schwarz; Marta Garnelo; S.M. Ali Eslami; Yee Whye Teh

arxiv: 1807.01622 · v1 · pith:BXW33AZKnew · submitted 2018-07-04 · 💻 cs.LG · stat.ML

Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S.M. Ali Eslami, Yee Whye Teh

keywords: neural, data, probabilistic, computationally, functions, like, models, processes

A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision. A Gaussian process (GP), on the other hand, is a probabilistic model that defines a distribution over possible functions, and is updated in light of data via the rules of probabilistic inference. GPs are probabilistic, data-efficient and flexible; however, they are also computationally intensive and thus limited in their applicability. We introduce a class of neural latent variable models which we call Neural Processes (NPs), combining the best of both worlds. Like GPs, NPs define distributions over functions, are capable of rapid adaptation to new observations, and can estimate the uncertainty in their predictions. Like NNs, NPs are computationally efficient during training and evaluation but also learn to adapt their priors to data. We demonstrate the performance of NPs on a range of learning tasks, including regression and optimisation, and compare and contrast with related models in the literature.
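
The abstract describes NPs as neural latent variable models that define a distribution over functions and adapt to new observations. A minimal NumPy sketch of what that can look like at prediction time follows. It is illustrative only: it assumes an encoder over (x, y) context pairs, a mean aggregator, a Gaussian latent variable, and a decoder over target inputs. All names, layer sizes, and the random untrained weights (`mlp_params`, `latent_params`, `np_predict`, `R_DIM`, `Z_DIM`) are hypothetical and are not taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_params(sizes):
    # Random, untrained weights: enough to show data flow, not a fitted model.
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    # Plain feed-forward network with tanh on all but the last layer.
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

R_DIM, Z_DIM = 8, 4                            # assumed sizes
encoder = mlp_params([2, 16, R_DIM])           # (x_i, y_i) -> r_i
latent_head = mlp_params([R_DIM, 2 * Z_DIM])   # r -> (mu_z, log_sigma_z)
decoder = mlp_params([1 + Z_DIM, 16, 2])       # (x*, z) -> (mu_y, log_sigma_y)

def latent_params(x_ctx, y_ctx):
    # Encode each context pair, then average: the mean makes the
    # representation independent of the order of the context points.
    r_i = mlp(encoder, np.concatenate([x_ctx, y_ctx], axis=-1))
    r = r_i.mean(axis=0)
    mu_z, log_sigma_z = np.split(mlp(latent_head, r), 2)
    return mu_z, log_sigma_z

def np_predict(x_ctx, y_ctx, x_tgt, n_samples=5):
    # Each latent sample z yields one function evaluated at all targets.
    mu_z, log_sigma_z = latent_params(x_ctx, y_ctx)
    means, stds = [], []
    for _ in range(n_samples):
        z = mu_z + np.exp(log_sigma_z) * rng.normal(size=Z_DIM)
        z_rep = np.tile(z, (x_tgt.shape[0], 1))
        out = mlp(decoder, np.concatenate([x_tgt, z_rep], axis=-1))
        means.append(out[:, 0])
        stds.append(np.exp(out[:, 1]))  # exp keeps the predicted std positive
    return np.stack(means), np.stack(stds)

# Usage: three context observations, seven target inputs.
x_ctx = np.array([[-1.0], [0.0], [1.0]])
y_ctx = np.sin(x_ctx)
x_tgt = np.linspace(-2.0, 2.0, 7).reshape(-1, 1)
means, stds = np_predict(x_ctx, y_ctx, x_tgt)
print(means.shape, stds.shape)
```

Adapting to new observations in this sketch is a single forward pass over the context set, with no gradient steps, and the spread across latent samples is one way to read off uncertainty. Training the weights (for example with a variational objective) is outside the scope of the sketch.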

Forward citations

Cited by 15 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Gradient-Based Program Synthesis with Neurally Interpreted Languages
cs.LG 2026-04 unverdicted novelty 8.0
NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...

Personalized Multi-Interest Modeling for Cross-Domain Recommendation to Cold-Start Users
cs.IR 2026-04 unverdicted novelty 7.0
NF-NPCDR enhances neural processes with normalizing flows to model personalized multi-interest preferences and uses a preference pool plus adaptive decoder to improve cross-domain recommendations for cold-start users.

Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
cs.LG 2026-04 unverdicted novelty 7.0
OptBias meta-learns optimization bias from Gaussian process synthetic tasks to improve surrogate performance for offline black-box optimization from small datasets.

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
cs.AI 2026-02 unverdicted novelty 7.0
GPS trains a small model on optimization history to predict prompt difficulty and select intermediate-difficulty diverse batches, yielding better training efficiency, final performance, and test-time allocation than b...

Robust Filter Attention: Self-Attention as Precision-Weighted State Estimation
cs.LG 2025-09 unverdicted novelty 7.0
Robust Filter Attention models self-attention as consistency-based state estimation under a linear SDE for token trajectories, matching standard attention complexity while showing lower perplexity and better zero-shot...

Transformer Neural Processes - Kernel Regression
cs.LG 2024-11 unverdicted novelty 7.0
TNP-KR adds a kernel regression transformer block, kernel attention bias, scan attention for translation invariance, and deep kernel attention to achieve lower complexity and state-of-the-art results on meta-regressio...

DeRegiME: Deep Regime Mixtures for Probabilistic Forecasting under Distribution Shift
cs.LG 2026-05 unverdicted novelty 6.0
DeRegiME uses a sparse variational GP with nonstationary regime-mixing kernel to decompose forecasts into mean, residual regimes, and noise for improved probabilistic forecasting under distribution shift.

Spectral Transformer Neural Processes
cs.LG 2026-05 unverdicted novelty 6.0
STNPs extend TNPs with a spectral aggregator that estimates context spectra, forms spectral mixtures, and injects task-adaptive frequency features to better handle periodicity.

Earth-o1: A Grid-free Observation-native Atmospheric World Model
cs.CV 2026-05 unverdicted novelty 6.0
Earth-o1 learns continuous atmospheric dynamics from ungridded observations and matches operational IFS forecast skill in hindcasts.

Learning to Theorize the World from Observation
cs.LG 2026-05 unverdicted novelty 6.0
NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.

Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
cs.LG 2026-04 unverdicted novelty 6.0
OptBias meta-learns reusable optimization bias from Gaussian process synthetic tasks to improve surrogate ranking performance on small offline black-box optimization datasets.

Neural Stochastic Processes for Satellite Precipitation Refinement
cs.CV 2026-04 unverdicted novelty 6.0
NSP model fuses satellite and gauge data with neural processes and SDEs, outperforming 13 baselines and JAXA's operational product on a new 43k-sample US benchmark across six metrics.

Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks
cs.LG 2026-04 unverdicted novelty 5.0
OptBias meta-learns optimization bias via Gaussian process synthetic tasks to boost surrogate performance for small-data offline black-box optimization across benchmarks.

Exploring Temporal Representation in Neural Processes for Multimodal Action Prediction
cs.RO 2026-04 unverdicted novelty 5.0
A revised DMBN with positional time encoding improves temporal representation and generalization in neural processes for multimodal robotic action prediction.

Regime-Aware Conditional Neural Processes with Multi-Criteria Decision Support for Operational Electricity Price Forecasting
cs.LG 2025-07 unverdicted novelty 5.0
R-NP model uses DS-HDP-HMM regime detection plus per-regime CNPs to produce regime-weighted price forecasts that rank as the most balanced option under TOPSIS across 2021-2023 when tested in battery arbitrage and grid...