
arXiv: 2508.06614 · v2 · submitted 2025-08-08 · 💻 cs.LG · cond-mat.stat-mech · quant-ph

Local Diffusion Models and Phases of Data Distributions

Pith reviewed 2026-05-18 23:30 UTC · model grok-4.3

classification 💻 cs.LG · cond-mat.stat-mech · quant-ph
keywords diffusion models · local denoisers · phase transitions · data distribution phases · spatial Markovianity · score functions · generative models · efficient architectures

The pith

The reverse denoising process splits into an early trivial phase and a late data phase separated by a rapid transition where local denoisers must fail.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines phases of data distributions by whether distributions can be connected through spatially local operations along the diffusion path. This definition shows that the reverse process begins in a trivial phase, ends in a data phase, and crosses a narrow interval of rapid change where any local denoiser necessarily fails. A reader would care because the result indicates that global score computations are required only inside that narrow interval, so local neural networks can handle most of the denoising process. The work links local-denoiser success to spatial Markovianity and uses this link as an operational test for the transition points, confirming the pattern on real datasets.

Core claim

We define two distributions as belonging to the same data distribution phase if they can be mutually connected via spatially local operations such as local denoisers, along the same evolution path as the diffusion. We demonstrate that the reverse denoising process consists of an early trivial phase and a late data phase, sandwiching a rapid phase transition where local denoisers must fail. We further demonstrate that the performance of local denoisers is closely tied to spatial Markovianity, which provides an operational criterion for diagnosing such phase transitions.
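
One standard way to quantify spatial Markovianity is the conditional mutual information I(A:C|B) between two regions A and C separated by a buffer B, which vanishes exactly when A and C are conditionally independent given B. The Figure 3 caption mentions CMI, but the abstract does not say how it is estimated, so the following is only a minimal sketch under a Gaussian assumption that is not taken from the paper. The names `gaussian_cmi` and `chain_covariance`, the three-site toy chain, and the added shared factor are hypothetical illustrations.

```python
import numpy as np


def gaussian_cmi(cov, idx_a, idx_b, idx_c):
    """Conditional mutual information I(A:C|B) in nats.

    Assumption (not from the paper): the variables are jointly Gaussian with
    covariance `cov`, so the CMI reduces to log-determinants of sub-blocks.
    """
    def logdet(idx):
        _sign, value = np.linalg.slogdet(cov[np.ix_(idx, idx)])
        return value

    a, b, c = list(idx_a), list(idx_b), list(idx_c)
    return 0.5 * (logdet(a + b) + logdet(b + c) - logdet(b) - logdet(a + b + c))


def chain_covariance(n, rho):
    """Covariance of a stationary AR(1) chain: cov[i, j] = rho ** |i - j|."""
    i = np.arange(n)
    return rho ** np.abs(i[:, None] - i[None, :])


# Toy "local" distribution: a nearest-neighbour Markov chain on three sites.
cov_local = chain_covariance(3, 0.3)
# Toy "nonlocal" distribution: the same chain plus a factor shared by all sites.
cov_global = cov_local + 1.0

cmi_local = gaussian_cmi(cov_local, [0], [1], [2])
cmi_global = gaussian_cmi(cov_global, [0], [1], [2])
print(f"CMI, Markov chain:       {cmi_local:.3e}")
print(f"CMI, with shared factor: {cmi_global:.3e}")
```

In this toy setting the chain gives a CMI that is zero up to rounding, while the shared factor leaves a strictly positive CMI. That is the qualitative signature the Markovianity criterion looks for; real image data would need a non-Gaussian estimator.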

What carries the argument

Phases of data distributions, defined as equivalence classes under spatially local operations along the diffusion path, which locate the narrow interval where local denoisers fail.

If this is right

  • Far from the phase transition point, small local neural networks can compute the score function.
  • Global neural networks are needed only inside the narrow time window around each phase transition.
  • Spatial Markovianity supplies a practical test for locating those transition times.
  • Diffusion architectures can therefore use local networks for most timesteps and global networks only near the transitions.
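
The architectural suggestion in the last bullet can be sketched as a simple dispatcher that routes each reverse step to a local or a global denoiser depending on the time. This is a hedged illustration, not the paper's implementation: `make_hybrid_denoiser`, the stub denoisers, the time grid, and the window `(0.445, 0.555)` are all hypothetical placeholders.

```python
from typing import Callable, Sequence, Tuple

import numpy as np

Denoiser = Callable[[np.ndarray, float], np.ndarray]
Window = Tuple[float, float]


def make_hybrid_denoiser(local: Denoiser, global_: Denoiser,
                         windows: Sequence[Window]) -> Denoiser:
    """Use `global_` inside any transition window and `local` elsewhere."""
    def denoise(x: np.ndarray, t: float) -> np.ndarray:
        use_global = any(lo <= t <= hi for lo, hi in windows)
        return global_(x, t) if use_global else local(x, t)
    return denoise


# Stub denoisers that only count how often they are called (placeholders
# for a small local network and a large global network).
calls = {"local": 0, "global": 0}


def local_stub(x, t):
    calls["local"] += 1
    return x


def global_stub(x, t):
    calls["global"] += 1
    return x


# Hypothetical transition window; in practice it would come from a diagnostic.
hybrid = make_hybrid_denoiser(local_stub, global_stub, windows=[(0.445, 0.555)])

x = np.zeros((8, 8))
for t in np.linspace(1.0, 0.0, 101):  # reverse process from t = 1 to t = 0
    x = hybrid(x, float(t))

print(calls)
```

With a narrow window, most steps go to the cheap local model, which is the efficiency argument the bullets make.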

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same phase definition could be tested on other iterative generative processes to see whether similar transitions appear.
  • Architectures that switch between local and global layers at detected transition times could be built and benchmarked for speed gains.
  • Repeating the Markovianity diagnostic across many datasets would show whether the phase structure is common to high-dimensional data.
  • The framework may connect to other non-equilibrium analyses of generative models and suggest new ways to measure locality requirements.

Load-bearing premise

Two distributions belong to the same phase precisely when they can be connected by spatially local operations along the diffusion path.

What would settle it

An experiment in which a local denoiser maintains high performance through the entire reverse process on a standard image dataset, with no detectable drop at the predicted transition time, would contradict the existence of a rapid phase where local denoisers must fail.
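
The test described above amounts to comparing per-timestep error curves of a local and a global denoiser and checking whether any gap appears. A minimal sketch follows, assuming such curves are already available. The function name `failure_windows`, the tolerance, and the synthetic curves are hypothetical and are not results from the paper.

```python
import numpy as np


def failure_windows(times, local_err, global_err, tol):
    """Return (start, end) time intervals where local_err - global_err > tol."""
    times = np.asarray(times, dtype=float)
    gap = np.asarray(local_err, dtype=float) - np.asarray(global_err, dtype=float)
    flagged = gap > tol

    windows = []
    start = None
    for i, is_flagged in enumerate(flagged):
        if is_flagged and start is None:
            start = i                      # a flagged run begins
        elif not is_flagged and start is not None:
            windows.append((float(times[start]), float(times[i - 1])))
            start = None                   # the flagged run ended at i - 1
    if start is not None:                  # run extends to the last timestep
        windows.append((float(times[start]), float(times[-1])))
    return windows


# Synthetic curves for illustration only: the local error has a bump near t = 0.5.
times = np.linspace(0.0, 1.0, 11)
global_err = np.full_like(times, 0.1)
local_err = 0.1 + 0.5 * np.exp(-((times - 0.5) / 0.1) ** 2)

windows = failure_windows(times, local_err, global_err, tol=0.05)
print(windows)
```

An empty list on a standard image dataset, at a tolerance small compared with the global error, would be the outcome that contradicts the claimed transition.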

Figures

Figures reproduced from arXiv: 2508.06614 by Fangjun Hu, Guangkuo Liu, Xun Gao, Yifan F. Zhang.

Figure 1. Schematic of diffusion models and phases of data distribu…
Figure 2. Schematics of designing local denoisers. For time step…
Figure 3. (a) CMI…
Figure 4. 64 samples of denoised images, with local denoisers (…

Original abstract

As a class of generative artificial intelligence frameworks inspired by statistical physics, diffusion models have shown extraordinary performance in synthesizing complicated data distributions through a denoising process gradually guided by score functions. Real-life data, like images, is often spatially structured in low-dimensional spaces. However, ordinary diffusion models ignore this local structure and learn spatially global score functions, which are often computationally expensive. In this work, motivated by recent advances in non-equilibrium statistical physics, we develop a generic framework for defining phases of data distributions and use it to analyze the locality requirements of denoisers in diffusion models. We define two distributions as belonging to the same data distribution phase if they can be mutually connected via spatially local operations such as local denoisers, along the same evolution path as the diffusion. We demonstrate that the reverse denoising process consists of an early trivial phase and a late data phase, sandwiching a rapid phase transition where local denoisers must fail. We further demonstrate that the performance of local denoisers is closely tied to spatial Markovianity, which provides an operational criterion for diagnosing such phase transitions. We validate this criterion through numerical experiments on real-world datasets. Our work suggests guidance for simpler and more efficient architectures of diffusion models: far from the phase transition point, we can use small local neural networks to compute the score function; global neural networks are only necessary around the narrow time interval of phase transitions. This result also opens up new directions for studying phases of data distributions, the broader science of generative artificial intelligence, and guiding the design of neural networks inspired by physics concepts.
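
The notion of a spatially local operation used in the abstract can be illustrated with a denoiser whose output at each pixel depends only on a bounded neighbourhood. The sketch below uses a plain box filter as a stand-in for a local neural network. This choice, the name `local_denoise`, and the radius are assumptions for illustration, not the architecture used in the paper.

```python
import numpy as np


def local_denoise(x, radius):
    """Replace each pixel by the mean of its (2r+1) x (2r+1) neighbourhood.

    A stand-in for a local denoiser: the output at (i, j) depends only on
    inputs within `radius` of (i, j).
    """
    h, w = x.shape
    padded = np.pad(x, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(x, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / size**2


# Locality check: perturb one pixel and see which outputs change.
rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
bumped = image.copy()
bumped[8, 8] += 1.0

diff = local_denoise(bumped, radius=2) - local_denoise(image, radius=2)
changed = np.argwhere(diff != 0)
print("rows/cols affected:", changed.min(), "to", changed.max())
```

Only outputs within two pixels of the perturbed site change. A global denoiser has no such restriction, which is why it can capture the long-range correlations the abstract associates with the transition.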

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a framework for phases of data distributions in diffusion models, defining two distributions as belonging to the same phase if they can be connected via spatially local operations along the diffusion path. It claims the reverse denoising process consists of an early trivial phase and a late data phase separated by a rapid transition at which local denoisers must fail, links this to spatial Markovianity as an operational diagnostic, validates the criterion numerically on real datasets, and concludes that small local networks suffice far from the transition while global networks are needed only in a narrow interval.

Significance. If the necessity claim for local-denoiser failure is shown to be independent of the phase definition and the Markovianity criterion is rigorously tied to it, the work could guide computationally lighter diffusion architectures for spatially structured data such as images. It also offers a physics-motivated lens for analyzing generative processes and may stimulate further study of phases in high-dimensional distributions.

major comments (2)
  1. [Abstract / phase definition] Abstract and the phase-definition paragraph: the statement that 'local denoisers must fail' at the transition follows directly from the adopted definition (two distributions are in the same phase precisely when they are connected by spatially local operations). The manuscript must therefore derive, rather than assume, that spatial Markovianity is equivalent to (or a strict upper bound on) the non-existence of any local denoiser, not merely correlated with the specific local architectures tested.
  2. [Numerical experiments] Numerical-validation section: the abstract states that the Markovianity criterion is validated on real-world datasets, yet supplies no quantitative details on how the transition point is located, what controls or baselines are used, or the observed effect sizes. Without these, it is unclear whether the rapid transition is observed independently of the definitional framework.
minor comments (2)
  1. [Definition of phases] Clarify the precise mathematical meaning of 'spatially local operations' and 'along the same evolution path' when the definition is first introduced.
  2. [Introduction] Add a short discussion of related prior work on local score estimation or physics-inspired restrictions on diffusion networks.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating where revisions have been made to improve clarity and rigor.

Point-by-point responses
  1. Referee: [Abstract / phase definition] Abstract and the phase-definition paragraph: the statement that 'local denoisers must fail' at the transition follows directly from the adopted definition (two distributions are in the same phase precisely when they are connected by spatially local operations). The manuscript must therefore derive, rather than assume, that spatial Markovianity is equivalent to (or a strict upper bound on) the non-existence of any local denoiser, not merely correlated with the specific local architectures tested.

    Authors: We agree that the claim of local denoiser failure at the transition is a direct logical consequence of the phase definition, as two distributions belong to different phases precisely when no sequence of spatially local operations connects them along the diffusion path; this is derived from the definition rather than assumed. In the revised manuscript we have made this implication explicit in the abstract and the phase-definition section. On the link to spatial Markovianity, the original text presents it as an operational diagnostic supported by theoretical arguments connecting the loss of locality in the score function to the emergence of long-range correlations (i.e., Markovianity violation). We acknowledge that a general proof establishing equivalence or a strict upper bound for arbitrary local architectures is not supplied and would constitute a substantial extension. We have therefore added a clarifying paragraph that distinguishes the definitional necessity from the Markovianity criterion, notes the empirical correlation observed for the tested local networks, and flags a rigorous general proof as an open question for future work.

    Revision: partial

  2. Referee: [Numerical experiments] Numerical-validation section: the abstract states that the Markovianity criterion is validated on real-world datasets, yet supplies no quantitative details on how the transition point is located, what controls or baselines are used, or the observed effect sizes. Without these, it is unclear whether the rapid transition is observed independently of the definitional framework.

    Authors: We thank the referee for identifying this gap in presentation. The revised numerical-validation section now supplies the requested quantitative information: the procedure used to locate the transition via the spatial Markovianity measure, the control experiments and baselines (including direct comparisons against global architectures), and the measured effect sizes on denoising performance for the real-world datasets examined. These additions demonstrate that the rapid transition is detected consistently and is not an artifact of the phase-definition framework itself.

    Revision: yes

Circularity Check

1 step flagged

Phase definition via local connectivity makes 'local denoisers must fail' at transition largely definitional rather than independently demonstrated

specific steps
  1. self-definitional [Abstract]
    "We define two distributions as belonging to the same data distribution phase if they can be mutually connected via spatially local operations such as local denoisers, along the same evolution path as the diffusion. We demonstrate that the reverse denoising process consists of an early trivial phase and a late data phase, sandwiching a rapid phase transition where local denoisers must fail."

    The demonstration that local denoisers must fail at the transition follows immediately from the preceding definition: distributions in different phases are precisely those that cannot be connected by local operations. Identifying a phase transition therefore entails the failure of local denoisers by the definition itself, rather than by an independent argument or external theorem.

full rationale

The paper's central claim that local denoisers must fail at the phase transition reduces directly to its own definition of phases as equivalence classes under spatially local operations along the diffusion path. By defining different phases as those not connectable by local denoisers, the failure at the transition point is true by construction once the transition is identified. The additional link to spatial Markovianity provides an operational diagnostic that is validated numerically on datasets, but does not independently derive the necessity claim from first principles outside the definitional framework. This produces partial circularity in the load-bearing step while leaving room for the empirical validation to add non-circular content.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on the definition of phases via local connectivity and the assumption that real data possesses low-dimensional spatial structure; no explicit free parameters or new particles are introduced in the abstract.

axioms (2)
  • [domain assumption] Real-life data such as images is spatially structured in low-dimensional spaces.
    Invoked in the opening paragraph to motivate local denoisers.
  • [ad hoc to paper] Two distributions belong to the same phase if they can be mutually connected via spatially local operations along the diffusion path.
    This is the central definitional axiom used to identify the phase transition.
invented entities (1)
  • data distribution phase (no independent evidence)
    Purpose: To classify distributions according to whether they are reachable from each other by local denoisers during diffusion.
    New conceptual object introduced to organize the denoising trajectory; no independent falsifiable prediction is stated in the abstract.




Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Concurrence of Symmetry Breaking and Nonlocality Phase Transitions in Diffusion Models

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    Symmetry breaking and nonlocality phase transitions occur nearly simultaneously during diffusion model generation in modern transformers.

  2. Learning and Generating Mixed States Prepared by Shallow Channel Circuits

    quant-ph · 2026-04 · unverdicted · novelty 7.0

    Any mixed state in the trivial phase can be efficiently learned and approximately generated by a shallow local channel circuit from polynomial measurements, without access to the original circuit.

  3. Learning and Generating Mixed States Prepared by Shallow Channel Circuits

    quant-ph · 2026-04 · unverdicted · novelty 6.0

    Mixed states in the trivial phase can be approximately generated by a learned shallow local channel circuit from measurement copies alone, with polynomial sample and runtime complexity.
