Generation Properties of Stochastic Interpolation under Finite Training Set

Shaohui Lin; Yunchen Li; Zhou Yu

arxiv: 2509.21925 · v3 · pith:3OMENC5Qnew · submitted 2025-09-26 · 💻 cs.LG · cs.AI



keywords training, generative, samples, stochastic, finite, generation, process, under


This paper investigates the theoretical behavior of generative models trained on finite training sets. Within the stochastic interpolation generative framework, we derive closed-form expressions for the optimal velocity field and score function when only a finite number of training samples are available. We show that, under certain regularity conditions, the deterministic generative process exactly recovers the training samples, while the stochastic generative process yields training samples with added Gaussian noise. Beyond this idealized setting, we account for model estimation errors and introduce formal definitions of underfitting and overfitting specific to generative models. Our theoretical analysis reveals that, in the presence of estimation errors, the stochastic generation process effectively produces convex combinations of training samples corrupted by a mixture of uniform and Gaussian noise. Experiments on generation tasks and on downstream tasks such as classification support our theory.
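The claim that the deterministic process recovers training samples can be illustrated with a small numerical sketch. The abstract does not state the interpolant or the paper's exact formulas, so everything specific here is an assumption made for illustration: a linear interpolant x_t = (1 - t) z + t x_1 with a standard Gaussian z, an empirical data distribution over a toy training set, and plain Euler integration. Under those assumptions the conditional-expectation velocity is a softmax-weighted pull toward the training points, and the function names `empirical_velocity` and `generate` are hypothetical.

```python
import numpy as np

def empirical_velocity(x, t, train):
    """Velocity E[x_1 - z | x_t = x] for an empirical data distribution.

    Assumes (not taken from the paper) x_t = (1 - t) z + t x_1, z ~ N(0, I).
    """
    diff = x[None, :] - t * train                        # shape (n, d)
    # Posterior weight of each training point: x_t | x_i ~ N(t x_i, (1 - t)^2 I)
    logits = -np.sum(diff ** 2, axis=1) / (2.0 * (1.0 - t) ** 2)
    logits -= logits.max()                               # numerical stability
    w = np.exp(logits)
    w /= w.sum()
    # Given x_i, z = (x - t x_i) / (1 - t), hence x_i - z = (x_i - x) / (1 - t)
    return (w @ train - x) / (1.0 - t)

def generate(train, rng, steps=200):
    """Euler integration of dx/dt = v(x, t) from t = 0 to t = 1."""
    x = rng.standard_normal(train.shape[1])
    dt = 1.0 / steps
    for k in range(steps):                               # t = k * dt stays below 1
        x = x + dt * empirical_velocity(x, k * dt, train)
    return x

rng = np.random.default_rng(0)
train = np.array([[2.0, 0.0], [-2.0, 1.0], [0.0, -3.0]])  # toy training set
samples = np.stack([generate(train, rng) for _ in range(5)])
# Distance from each generated point to its nearest training sample
nearest = np.linalg.norm(samples[:, None, :] - train[None, :, :], axis=2).min(axis=1)
print(nearest)
```

In this toy setting each generated point should land essentially on one of the three training samples, which matches the memorization behavior the abstract describes for the deterministic process. The sketch does not cover the stochastic process or the estimation-error analysis.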


Forward citations

Cited by 1 Pith paper


On The Hidden Biases of Flow Matching Samplers
stat.ML 2025-12 unverdicted novelty 7.0

Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, and non-unique dynamics via flux-null fields, with base distribution controlli...