{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2019:IZHUHFCYHPHS2S6LMXZ53T65C7","short_pith_number":"pith:IZHUHFCY","schema_version":"1.0","canonical_sha256":"464f4394583bcf2d4bcb65f3ddcfdd17f17ca9e6a108204ca48924543df5f7e3","source":{"kind":"arxiv","id":"1907.05600","version":3},"attestation_state":"computed","paper":{"title":"Generative Modeling by Estimating Gradients of the Data Distribution","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A generative model learns gradients of noisy data distributions to drive annealed Langevin dynamics and produce samples without adversarial training.","cross_cats":["stat.ML"],"primary_cat":"cs.LG","authors_text":"Stefano Ermon, Yang Song","submitted_at":"2019-07-12T07:37:26Z","abstract_excerpt":"We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields of gradients of the perturbed data distribution for all noise levels. For sampling, we propose an annealed Langevin dynamics where we use gradients corresponding to gradually decreasing noise levels as"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"1907.05600","kind":"arxiv","version":3},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2019-07-12T07:37:26Z","cross_cats_sorted":["stat.ML"],"title_canon_sha256":"b75f07f5b1033e09b1209c28be091524212e58474763bb135b7ec8620e416b82","abstract_canon_sha256":"8b8a8ca576822bf40b200dc900d7ab59460fc585198d16880952c92d2eac2423"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:13.894500Z","signature_b64":"vYZir7X+glE6yUNCE1S7PuTZ/pBZ244mccQZofLbIwLu4zlrDUH8wd6ZryWPbcra5bWiOgfREH16Qnyj7j+qCA==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"464f4394583bcf2d4bcb65f3ddcfdd17f17ca9e6a108204ca48924543df5f7e3","last_reissued_at":"2026-05-17T23:38:13.893886Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:13.893886Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Generative Modeling by Estimating Gradients of the Data Distribution","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A generative model learns gradients of noisy data distributions to drive annealed Langevin dynamics and produce samples without adversarial training.","cross_cats":["stat.ML"],"primary_cat":"cs.LG","authors_text":"Stefano Ermon, Yang Song","submitted_at":"2019-07-12T07:37:26Z","abstract_excerpt":"We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields of gradients of the perturbed data distribution for all noise levels. For sampling, we propose an annealed Langevin dynamics where we use gradients corresponding to gradually decreasing noise levels as"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Our models produce samples comparable to GANs on MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.87 on CIFAR-10. Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Score-based generative modeling via multi-noise-level score matching and annealed Langevin dynamics produces samples on par with GANs and sets a new inception score record on CIFAR-10.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A generative model learns gradients of noisy data distributions to drive annealed Langevin dynamics and produce samples without adversarial training.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ba39214c102c2fa9d25b014968bd60b5d60bf3807bb9068b2abcefe578b38177"},"source":{"id":"1907.05600","kind":"arxiv","version":3},"verdict":{"id":"5edb9a9a-cf0a-4c43-a367-e3146d1cfd82","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-17T13:56:41.052084Z","strongest_claim":"Our models produce samples comparable to GANs on MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.87 on CIFAR-10. Additionally, we demonstrate that our models learn effective representations via image inpainting experiments.","one_line_summary":"Score-based generative modeling via multi-noise-level score matching and annealed Langevin dynamics produces samples on par with GANs and sets a new inception score record on CIFAR-10.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores.","pith_extraction_headline":"A generative model learns gradients of noisy data distributions to drive annealed Langevin dynamics and produce samples without adversarial training."},"references":{"count":66,"sample":[{"doi":"","year":2016,"title":"G. Alain, Y . Bengio, L. Yao, J. Yosinski, E. Thibodeau-Laufer, S. Zhang, and P. Vincent. GSNs: generative stochastic networks. Information and Inference, 2016","work_id":"257e3fc4-3552-42d6-8b29-fa05f0292c12","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In D. Precup and Y . W. Teh, editors,Proceedings of the 34th International Conference on Ma- chine Learning, volum","work_id":"0476ca03-7a2c-4410-9eb9-78e4b014fdb4","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2003,"title":"M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data represen- tation. Neural computation, 15(6):1373–1396, 2003","work_id":"0def5714-d589-46d8-83a4-281c81047140","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2013,"title":"Y . Bengio, L. Yao, G. Alain, and P. Vincent. Generalized denoising auto-encoders as generative models. In Advances in neural information processing systems, pages 899–907, 2013","work_id":"6ed59c5e-d966-4018-b108-2a701f7b1dfe","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"Learning to Generate Samples from Noise through Infusion Training","work_id":"dcfc7c7b-9a83-4c3a-9cc6-3bba57a917e1","ref_index":5,"cited_arxiv_id":"1703.06975","is_internal_anchor":true}],"resolved_work":66,"snapshot_sha256":"cd1d876188adea6a00b1580ffd92a9df06c81adadf09639b8f98cfe679df431d","internal_anchors":10},"formal_canon":{"evidence_count":2,"snapshot_sha256":"acbb7a964ecf6987e147ce17748e3845dadaf02af51890bba3ae8868e33e0d8e"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"1907.05600","created_at":"2026-05-17T23:38:13.893984+00:00"},{"alias_kind":"arxiv_version","alias_value":"1907.05600v3","created_at":"2026-05-17T23:38:13.893984+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.1907.05600","created_at":"2026-05-17T23:38:13.893984+00:00"},{"alias_kind":"pith_short_12","alias_value":"IZHUHFCYHPHS","created_at":"2026-05-18T12:33:18.533446+00:00"},{"alias_kind":"pith_short_16","alias_value":"IZHUHFCYHPHS2S6L","created_at":"2026-05-18T12:33:18.533446+00:00"},{"alias_kind":"pith_short_8","alias_value":"IZHUHFCY","created_at":"2026-05-18T12:33:18.533446+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":19,"internal_anchor_count":19,"sample":[{"citing_arxiv_id":"2509.20430","citing_title":"pop-cosmos: Star formation over 12 Gyr from generative modelling of a deep infrared-selected galaxy catalogue","ref_index":220,"is_internal_anchor":true},{"citing_arxiv_id":"2509.26258","citing_title":"EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules","ref_index":52,"is_internal_anchor":true},{"citing_arxiv_id":"2512.11077","citing_title":"A probabilistic framework for crystal structure denoising, phase classification, and order parameters","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2512.23748","citing_title":"A Review of Diffusion-based Simulation-Based Inference: Foundations and Applications in Non-Ideal Data Scenarios","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2305.02463","citing_title":"Shap-E: Generating Conditional 3D Implicit Functions","ref_index":62,"is_internal_anchor":true},{"citing_arxiv_id":"2603.22564","citing_title":"MIOFlow 2.0: A unified framework for inferring cellular stochastic dynamics from single cell and spatial transcriptomics data","ref_index":54,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08580","citing_title":"Adjoint Matching through the Lens of the Stochastic Maximum Principle in Optimal Control","ref_index":9,"is_internal_anchor":true},{"citing_arxiv_id":"2604.02415","citing_title":"Generative models on phase space","ref_index":14,"is_internal_anchor":true},{"citing_arxiv_id":"2105.05233","citing_title":"Diffusion Models Beat GANs on Image Synthesis","ref_index":59,"is_internal_anchor":true},{"citing_arxiv_id":"2605.11266","citing_title":"PG-3DGS: Optimizing 3D Gaussian Splatting to Satisfy Physics Objectives","ref_index":29,"is_internal_anchor":true},{"citing_arxiv_id":"2604.06779","citing_title":"VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2502.05171","citing_title":"Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach","ref_index":146,"is_internal_anchor":true},{"citing_arxiv_id":"2605.02852","citing_title":"Inferring Active Neural Circuits Using Diffusion Scores","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2605.06134","citing_title":"Diffusion model for SU(N) gauge theories","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24416","citing_title":"Scaling Properties of Continuous Diffusion Spoken Language Models","ref_index":24,"is_internal_anchor":true},{"citing_arxiv_id":"2605.00229","citing_title":"A unified perspective on fine-tuning and sampling with diffusion and flow models","ref_index":155,"is_internal_anchor":true},{"citing_arxiv_id":"1910.03771","citing_title":"HuggingFace's Transformers: State-of-the-art Natural Language Processing","ref_index":137,"is_internal_anchor":true},{"citing_arxiv_id":"2604.10465","citing_title":"Rethinking the Diffusion Model from a Langevin Perspective","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2604.06779","citing_title":"VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion","ref_index":2,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7","json":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7.json","graph_json":"https://pith.science/api/pith-number/IZHUHFCYHPHS2S6LMXZ53T65C7/graph.json","events_json":"https://pith.science/api/pith-number/IZHUHFCYHPHS2S6LMXZ53T65C7/events.json","paper":"https://pith.science/paper/IZHUHFCY"},"agent_actions":{"view_html":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7","download_json":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7.json","view_paper":"https://pith.science/paper/IZHUHFCY","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=1907.05600&json=true","fetch_graph":"https://pith.science/api/pith-number/IZHUHFCYHPHS2S6LMXZ53T65C7/graph.json","fetch_events":"https://pith.science/api/pith-number/IZHUHFCYHPHS2S6LMXZ53T65C7/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7/action/timestamp_anchor","attest_storage":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7/action/storage_attestation","attest_author":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7/action/author_attestation","sign_citation":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7/action/citation_signature","submit_replication":"https://pith.science/pith/IZHUHFCYHPHS2S6LMXZ53T65C7/action/replication_record"}},"created_at":"2026-05-17T23:38:13.893984+00:00","updated_at":"2026-05-17T23:38:13.893984+00:00"}