{"paper":{"title":"On The Hidden Biases of Flow Matching Samplers","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Replacing the target distribution with finite-sample surrogates in flow matching introduces three coupled biases that alter learned paths and dynamics.","cross_cats":["cs.LG","math.PR"],"primary_cat":"stat.ML","authors_text":"Soon Hoe Lim","submitted_at":"2025-12-18T17:02:11Z","abstract_excerpt":"Flow matching (FM) constructs continuous-time ODE samplers by prescribing probability paths between a base distribution and a target distribution. In this note, we study FM through the lens of finite-sample plug-in estimation. In addition to replacing population expectations by sample averages, one may replace the target distribution itself by a finite-sample surrogate, ranging from the empirical measure to a smoothed estimator. This viewpoint yields a natural hierarchy of empirical FM models. For affine conditional flows, we derive the exact empirical minimizer and identify a smoothed plug-in"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"For affine conditional flows, the exact empirical minimizer is derived and a smoothed plug-in regime yields a terminal law that is exactly a kernel-mixture estimator; fixed empirical marginal paths admit explicit flux-null corrections to the dynamics.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The derivations rely on the assumption that conditional flows are affine and that the plug-in hierarchy (empirical measure to smoothed estimators) is the appropriate finite-sample surrogate for the target distribution.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, 
and non-unique dynamics via flux-null fields, with base distribution controlling kinetic energy tails.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Replacing the target distribution with finite-sample surrogates in flow matching introduces three coupled biases that alter learned paths and dynamics.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"2ab19ae71eca591c2dc7ca934b7efba8250ce2a252fee759024bf802c05c6b5a"},"source":{"id":"2512.16768","kind":"arxiv","version":3},"verdict":{"id":"49bf0e41-d169-4d8f-8d0a-21b5b0d6da5d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T21:07:49.314302Z","strongest_claim":"For affine conditional flows, the exact empirical minimizer is derived and a smoothed plug-in regime yields a terminal law that is exactly a kernel-mixture estimator; fixed empirical marginal paths admit explicit flux-null corrections to the dynamics.","one_line_summary":"Empirical flow matching introduces coupled biases from plug-in estimation, including altered statistical targets, non-gradient minimizers, and non-unique dynamics via flux-null fields, with base distribution controlling kinetic energy tails.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The derivations rely on the assumption that conditional flows are affine and that the plug-in hierarchy (empirical measure to smoothed estimators) is the appropriate finite-sample surrogate for the target distribution.","pith_extraction_headline":"Replacing the target distribution with finite-sample surrogates in flow matching introduces three coupled biases that alter learned paths and dynamics."},"references":{"count":51,"sample":[{"doi":"","year":2023,"title":"Stochastic Interpolants: A Unifying Framework for Flows and 
Diffusions","work_id":"c2c7dd8f-fbfb-4591-89ec-9a3a0e6744bd","ref_index":1,"cited_arxiv_id":"2303.08797","is_internal_anchor":true},{"doi":"","year":2024,"title":"Learning to sample better. Journal of Statistical Mechanics: Theory and Experiment, 2024(10):104014, 2024","work_id":"3a28ec8d-e876-4551-9765-a963959c2595","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2005,"title":"Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré. Gradient Flows: In Metric Spaces And In the Space of Probability Measures. Springer, 2005","work_id":"ff0367eb-04df-49e7-b46f-98256146af93","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Bronstein, Pierre Vandergheynst, and Adam Gosztolai","work_id":"ee05999c-81fb-4b6a-9c95-17162e808ef0","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Memorization and regularization in generative diffusion models. arXiv preprint arXiv:2501.15785","work_id":"75eadb61-366e-4504-8594-81b162cdb5d7","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":51,"snapshot_sha256":"d44f3c32292c883a0972487576f65f6588fcc1fedd24357f167ca6a673c244b8","internal_anchors":7},"formal_canon":{"evidence_count":2,"snapshot_sha256":"2171e4cae468b9d0a93c8dfc9a7073102242e4bcb15dac490cecacce1ab323ab"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}