
arXiv: 2511.09016 · v2 · submitted 2025-11-12 · 📡 eess.SY · cs.LG · cs.SY

Assumed Density Filtering and Smoothing with Neural Network Surrogate Models

Pith reviewed 2026-05-17 22:50 UTC · model grok-4.3

classification: 📡 eess.SY, cs.LG, cs.SY
keywords: assumed density filtering, neural network surrogate models, uncertainty propagation, analytic moments, state estimation, Kalman filter, RTS smoother, nonlinear filtering

The pith

Assumed density filters and smoothers can propagate uncertainty accurately through neural network surrogate models using closed-form mean and covariance formulas.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that neural networks can serve as accurate surrogate models inside assumed-density filtering and smoothing by replacing sampling with an analytic expression for the network output moments when the input is Gaussian. This enables uncertainty to be tracked through nonlinear dynamics or measurement functions without the computational cost of particle methods. The authors demonstrate the approach on a stochastic Lorenz system and a Wiener system, where it yields better state estimates than standard alternatives. They further show that these estimates improve the performance of linear quadratic regulation when used in feedback. Cross-entropy is presented as a more suitable metric than RMSE for judging the quality of the resulting Gaussian approximations.

Core claim

By inserting a state-of-the-art analytic formula that computes the mean and covariance of a deep neural network given Gaussian inputs directly into the assumed-density recursion, the Kalman filter and Rauch-Tung-Striebel smoother can be applied to nonlinear systems whose transition or output functions are represented by neural networks, producing accurate Gaussian state estimates without Monte Carlo sampling.
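
A minimal sketch of one filter step in this style, written for a scalar state to keep it short. The moment functions `f_moments` and `h_moments` are hypothetical stand-ins for any routine that returns the output mean, output variance, and input-output cross-covariance of a network under a Gaussian input. Here they are replaced by an exact linear map, for which the step reduces to the classical Kalman filter. None of these names come from the paper or its repository.

```python
from typing import Callable, Tuple

# Maps an input (mean, variance) to (output mean, output variance,
# input-output cross-covariance). Hypothetical interface, not the paper's API.
MomentFn = Callable[[float, float], Tuple[float, float, float]]


def linear_moments(a: float) -> MomentFn:
    """Exact Gaussian moments of x -> a * x, standing in for a network."""
    def moments(mean: float, var: float) -> Tuple[float, float, float]:
        return a * mean, a * a * var, a * var
    return moments


def adf_filter_step(m: float, P: float, y: float,
                    f_moments: MomentFn, h_moments: MomentFn,
                    Q: float, R: float) -> Tuple[float, float]:
    """One assumed-density predict/update step for a scalar state."""
    # Predict: push the current Gaussian through the transition moments,
    # then add process noise variance Q.
    m_pred, P_f, _ = f_moments(m, P)
    P_pred = P_f + Q
    # Measurement moments under the predicted Gaussian, plus noise variance R.
    y_hat, S_h, C = h_moments(m_pred, P_pred)
    S = S_h + R
    # Gaussian conditioning on the observed y.
    K = C / S
    m_new = m_pred + K * (y - y_hat)
    P_new = P_pred - K * S * K
    return m_new, P_new


m1, P1 = adf_filter_step(m=0.0, P=1.0, y=1.0,
                         f_moments=linear_moments(0.5),
                         h_moments=linear_moments(1.0),
                         Q=0.1, R=0.5)
print(m1, P1)
```

Swapping `linear_moments` for a network moment routine with the same signature leaves the recursion unchanged, which is the design point of the assumed-density approach.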

What carries the argument

The analytic formula for the mean and covariance of a deep neural network output given a Gaussian input distribution, inserted into the assumed-density recursion steps of the filter and smoother.
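
The smoother side of the recursion can be sketched in the same way. The function below is a scalar Rauch-Tung-Striebel backward step that takes the cross-covariance between the current and next state as an input, so any moment routine that supplies it could be plugged in. The function name, the argument names, and the numbers in the example are illustrative assumptions, not taken from the paper.

```python
from typing import Tuple


def rts_backward_step(m_filt: float, P_filt: float,
                      m_pred_next: float, P_pred_next: float,
                      C_next: float,
                      m_smooth_next: float, P_smooth_next: float
                      ) -> Tuple[float, float]:
    """One scalar RTS backward step.

    m_filt, P_filt           : filtered moments at time t
    m_pred_next, P_pred_next : predicted moments at time t+1
    C_next                   : Cov(x_t, x_{t+1}) under the filtered Gaussian
    m_smooth_next, P_smooth_next : smoothed moments at time t+1
    """
    G = C_next / P_pred_next  # smoother gain
    m_smooth = m_filt + G * (m_smooth_next - m_pred_next)
    P_smooth = P_filt + G * G * (P_smooth_next - P_pred_next)
    return m_smooth, P_smooth


# Example with a linear transition x_{t+1} = 0.5 * x_t + noise (variance 0.1),
# for which the required moments are exact.
a, Q = 0.5, 0.1
m_filt, P_filt = 1.0, 0.2
m_pred_next = a * m_filt
P_pred_next = a * a * P_filt + Q
C_next = a * P_filt

m_s, P_s = rts_backward_step(m_filt, P_filt, m_pred_next, P_pred_next,
                             C_next, m_smooth_next=0.8, P_smooth_next=0.1)
print(m_s, P_s)
```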

If this is right

  • State estimates in neural-network-modeled nonlinear systems become available at lower computational cost than sampling-based filters.
  • Linear quadratic regulation achieves lower cost when the state feedback uses the analytic-moment estimates rather than sampling-based ones.
  • Cross-entropy between the approximate and true posterior distributions exposes performance differences that RMSE misses on the Lorenz and Wiener examples.
  • The same moment formula can be reused for both filtering and RTS smoothing passes without additional sampling.
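
To illustrate why a likelihood-based score can separate estimators that RMSE cannot, the sketch below scores two hypothetical filters that report identical means but different variances. The exact definition and normalization used in the paper are not reproduced on this page, so the average Gaussian negative log-likelihood used here is an assumed stand-in for the cross-entropy metric, and the error values are made up for illustration.

```python
import math
from typing import Sequence


def rmse(errors: Sequence[float]) -> float:
    """Root-mean-square error; depends only on the point estimates."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mean_gaussian_nll(errors: Sequence[float],
                      variances: Sequence[float]) -> float:
    """Average negative log-density of the truth under N(estimate, variance).

    This is a sample estimate of the cross-entropy between the true
    distribution and the reported Gaussian (assumed definition).
    """
    total = 0.0
    for e, v in zip(errors, variances):
        total += 0.5 * (math.log(2.0 * math.pi * v) + e * e / v)
    return total / len(errors)


# Illustrative estimation errors (truth minus reported mean).
errors = [0.5, -1.0, 1.5, -0.5, 1.0]

# Two filters with the same means but different reported variances.
calibrated_vars = [1.0] * len(errors)
overconfident_vars = [0.01] * len(errors)

shared_rmse = rmse(errors)  # identical for both filters
nll_calibrated = mean_gaussian_nll(errors, calibrated_vars)
nll_overconfident = mean_gaussian_nll(errors, overconfident_vars)
print(shared_rmse, nll_calibrated, nll_overconfident)
```

Because RMSE ignores the reported variance, it assigns both filters the same score, while the likelihood-based score penalizes the overconfident one.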

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The technique could be combined with learned neural dynamics models that are trained to preserve the Gaussian-input assumption for better long-horizon prediction.
  • In embedded control systems, the absence of sampling may enable faster real-time execution while retaining uncertainty awareness.
  • Similar moment-matching ideas might extend to other Gaussian-approximation filters if closed-form expressions exist for their surrogate components.

Load-bearing premise

The analytic formula for neural network output moments remains sufficiently accurate when used repeatedly inside the assumed-density filter and smoother on the test systems.

What would settle it

On the stochastic Lorenz system, compare the filter's propagated state covariance against the empirical covariance obtained from a large number of Monte Carlo trajectory simulations; a large mismatch would show the approximation has degraded.
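
A much smaller version of this kind of check can be sketched for a single ReLU unit, where the Gaussian-input mean and variance have well-known closed forms. This is not the paper's formula for deep networks; it only shows the pattern of comparing a closed-form moment against a Monte Carlo estimate. The input mean, standard deviation, sample count, and seed are arbitrary choices.

```python
import math
import random
from typing import Tuple


def std_normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def relu_moments(mu: float, sigma: float) -> Tuple[float, float]:
    """Closed-form mean and variance of max(0, x) for x ~ N(mu, sigma^2)."""
    a = mu / sigma
    mean = mu * std_normal_cdf(a) + sigma * std_normal_pdf(a)
    second = ((mu * mu + sigma * sigma) * std_normal_cdf(a)
              + mu * sigma * std_normal_pdf(a))
    return mean, second - mean * mean


def monte_carlo_relu_moments(mu: float, sigma: float,
                             n: int, seed: int = 0) -> Tuple[float, float]:
    """Sample-based estimate of the same two moments."""
    rng = random.Random(seed)
    samples = [max(0.0, rng.gauss(mu, sigma)) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var


cf_mean, cf_var = relu_moments(0.3, 1.2)
mc_mean, mc_var = monte_carlo_relu_moments(0.3, 1.2, n=200_000)
print(cf_mean, mc_mean, cf_var, mc_var)
```

For the full test described above, the same comparison would be made on the filter's propagated state covariance over many simulated trajectories rather than on a single unit.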

Figures

Figures reproduced from arXiv: 2511.09016 by Simon Kuang, Xinfan Lin.

Figure 1. Trajectory excerpt for Kalman filter ANALYTIC in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 2. Coverage for Kalman filter ANALYTIC in the Lorenz system state estimation problem. Closer to identity is best.
Figure 3. Trajectory excerpt for Kalman filter ANALYTIC (RECAL) in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 4. Coverage for Kalman filter ANALYTIC (RECAL) in the Lorenz system state estimation problem. Closer to identity is best.
Figure 5. Trajectory excerpt for Kalman filter MEAN-FIELD in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 6. Coverage for Kalman filter MEAN-FIELD in the Lorenz system state estimation problem. Closer to identity is best.
Figure 7. Trajectory excerpt for Kalman filter MEAN-FIELD (RECAL) in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 8. Coverage for Kalman filter MEAN-FIELD (RECAL) in the Lorenz system state estimation problem. Closer to identity is best.
Figure 9. Trajectory excerpt for Kalman filter LINEAR in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 10. Coverage for Kalman filter LINEAR in the Lorenz system state estimation problem. Closer to identity is best.
Figure 11. Trajectory excerpt for Kalman filter LINEAR (RECAL) in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 12. Coverage for Kalman filter LINEAR (RECAL) in the Lorenz system state estimation problem. Closer to identity is best.
Figure 13. Trajectory excerpt for Kalman filter UNSCENTED’95 in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 14. Coverage for Kalman filter UNSCENTED’95 in the Lorenz system state estimation problem. Closer to identity is best.
Figure 15. Trajectory excerpt for Kalman filter UNSCENTED’95 (RECAL) in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 16. Coverage for Kalman filter UNSCENTED’95 (RECAL) in the Lorenz system state estimation problem. Closer to identity is best.
Figure 17. Trajectory excerpt for Kalman filter UNSCENTED’02 in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 18. Coverage for Kalman filter UNSCENTED’02 in the Lorenz system state estimation problem. Closer to identity is best.
Figure 19. Trajectory excerpt for Kalman filter UNSCENTED’02 (RECAL) in the Lorenz system state estimation problem. Plus sign indicates hit; cross indicates miss: 10 crosses is best for nominal coverage.
Figure 20. Coverage for Kalman filter UNSCENTED’02 (RECAL) in the Lorenz system state estimation problem. Closer to identity is best.
Figure 21. Trajectory excerpt for Kalman filter ANALYTIC in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 22. Trajectory excerpt for Kalman filter ANALYTIC (RECAL) in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 23. Trajectory excerpt for Kalman filter LINEAR in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 24. Trajectory excerpt for Kalman filter LINEAR (RECAL) in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 25. Trajectory excerpt for Kalman filter UNSCENTED’95 in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 26. Trajectory excerpt for Kalman filter UNSCENTED’95 (RECAL) in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 27. Trajectory excerpt for Kalman filter UNSCENTED’02 in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
Figure 28. Trajectory excerpt for Kalman filter UNSCENTED’02 (RECAL) in the LTI state estimation problem. Plus sign indicates hit; cross indicates miss. Expect 10 misses for nominal 90% coverage.
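
The hit and miss counts in these captions correspond to a coverage calculation that can be sketched as follows. The code assumes a hit means the true value falls inside the central interval of a scalar Gaussian estimate at the nominal level; the paper's exact definition is not given on this page. The synthetic errors, the two reported standard deviations, and the seed are illustrative.

```python
import random
from statistics import NormalDist
from typing import Sequence


def actual_coverage(errors: Sequence[float],
                    stds: Sequence[float],
                    nominal: float) -> float:
    """Fraction of errors inside the central `nominal` Gaussian interval."""
    # Half-width multiplier, e.g. about 1.645 for nominal = 0.9.
    z = NormalDist().inv_cdf(0.5 * (1.0 + nominal))
    hits = sum(1 for e, s in zip(errors, stds) if abs(e) <= z * s)
    return hits / len(errors)


# Synthetic estimation errors with true standard deviation 1.
rng = random.Random(0)
errors = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

# One estimator reports the correct spread, the other half of it.
calibrated = actual_coverage(errors, [1.0] * len(errors), nominal=0.9)
overconfident = actual_coverage(errors, [0.5] * len(errors), nominal=0.9)
print(calibrated, overconfident)
```

Repeating the calculation over a grid of nominal levels gives a curve of actual against nominal coverage. A well-calibrated estimator lies near the identity line, and an overconfident one falls below it.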
Original abstract

The Kalman filter and Rauch-Tung-Striebel (RTS) smoother are optimal for state estimation in linear dynamic systems. With nonlinear systems, the challenge consists in how to propagate uncertainty through the state transitions and output function. For the case of a neural network model, we enable accurate uncertainty propagation using a recent state-of-the-art analytic formula for computing the mean and covariance of a deep neural network with Gaussian input. We argue that cross entropy is a more appropriate performance metric than RMSE for evaluating the accuracy of filters and smoothers. We demonstrate the superiority of our method for state estimation on a stochastic Lorenz system and a Wiener system, and find that our method enables more optimal linear quadratic regulation when the state estimate is used for feedback. Code available at https://github.com/simontheflutist/analytic-moments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that assumed-density filtering and smoothing can be performed accurately for nonlinear systems whose dynamics or observations are represented by trained neural-network surrogates by substituting a recent analytic formula for the exact mean and covariance of a DNN output under Gaussian input. The method is asserted to outperform standard baselines on a stochastic Lorenz system and a Wiener system when cross-entropy (rather than RMSE) is used as the performance metric, and to yield improved closed-loop LQR performance when the resulting state estimates are used for feedback.

Significance. If the analytic moment formula remains sufficiently accurate when the network is a learned surrogate and its output Gaussians are re-injected as inputs at subsequent time steps, the approach would supply a deterministic, low-variance alternative to sampling-based moment matching inside the ADF recursion. This would be valuable for real-time nonlinear estimation and control tasks that already employ neural surrogates.

major comments (2)
  1. [Abstract and experimental results] The central claim that the cited analytic formula yields accurate moments inside the assumed-density recursion rests on an untested assumption: that any truncation or approximation error in the formula does not compound when the DNN is a trained surrogate whose outputs become the next-step inputs. No Monte-Carlo moment estimates on the trained networks, no error-bound analysis, and no ablation isolating the contribution of the analytic moments versus architecture or training loss are reported.
  2. [Abstract] The superiority statements for the Lorenz and Wiener examples are presented without quantitative comparison tables, error bars, or explicit baseline implementations (e.g., UKF, particle filter, or Monte-Carlo ADF). Because the performance numbers depend on the external analytic formula remaining exact under recursion, the lack of such verification is load-bearing for the main result.
minor comments (2)
  1. The abstract states that cross-entropy is a more appropriate metric than RMSE; the precise definition and normalization used in the reported experiments should be given explicitly.
  2. The GitHub link for code is welcome for reproducibility; the repository should include the exact trained network weights and the scripts that regenerate the reported filter trajectories.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which highlight important aspects of validating the recursive application of the analytic moment formula and the presentation of our experimental results. We address each point below and will incorporate revisions to provide additional verification and quantitative comparisons.

Point-by-point responses
  1. Referee: [Abstract and experimental results] The central claim that the cited analytic formula yields accurate moments inside the assumed-density recursion rests on an untested assumption: that any truncation or approximation error in the formula does not compound when the DNN is a trained surrogate whose outputs become the next-step inputs. No Monte-Carlo moment estimates on the trained networks, no error-bound analysis, and no ablation isolating the contribution of the analytic moments versus architecture or training loss are reported.

    Authors: We agree that the manuscript would be strengthened by explicit checks on moment accuracy for the trained networks in the recursive setting. While the analytic formula provides exact mean and covariance for a single forward pass through the network given a Gaussian input, the referee correctly notes that we have not reported Monte-Carlo validations or error bounds for the multi-step recursion with surrogate outputs. In our experiments, the method's effectiveness is evidenced by improved state estimation accuracy (via cross-entropy) and better LQR performance compared to baselines. To directly address this, we will add Monte-Carlo moment estimates computed on the trained surrogate networks during the filtering and smoothing recursions for both the Lorenz and Wiener systems. We will also include an ablation study that isolates the effect of using analytic moments versus a sampling-based approach, and a short discussion on potential error accumulation. These additions will be included in a revised version of the experimental section.

    revision: yes

  2. Referee: [Abstract] The superiority statements for the Lorenz and Wiener examples are presented without quantitative comparison tables, error bars, or explicit baseline implementations (e.g., UKF, particle filter, or Monte-Carlo ADF). Because the performance numbers depend on the external analytic formula remaining exact under recursion, the lack of such verification is load-bearing for the main result.

    Authors: We acknowledge the value of presenting the results with quantitative tables, error bars, and explicit baseline comparisons to make the claims more robust. The original manuscript demonstrates superiority through figures and the cross-entropy metric on the stochastic Lorenz and Wiener systems, and shows improved LQR performance. However, we agree that including numerical tables with means and standard deviations from multiple runs, as well as direct implementations of the UKF, particle filter, and a Monte-Carlo ADF baseline, would provide clearer evidence. We will revise the manuscript to add these comparison tables and baseline details in the experimental results section, ensuring fair comparisons under the same conditions.

    revision: yes

Circularity Check

0 steps flagged

No circularity: central method applies external analytic formula to standard ADF recursion

full rationale

The paper's derivation chain consists of (1) training a neural network surrogate for the dynamics or observation map, (2) invoking a cited state-of-the-art analytic formula to compute the mean and covariance of the network output under Gaussian input (exact for each layer and composed layer by layer, according to the passage quoted from the paper), and (3) inserting those moments into the standard assumed-density filter/smoother recursion. None of these steps reduces to a quantity defined inside the paper by construction: the analytic moment formula is presented as prior work, the ADF closure is the classical one, and the reported filter performance on the Lorenz and Wiener examples is an empirical outcome rather than a tautological re-expression of fitted parameters. No self-definitional equations, fitted-input predictions, or load-bearing self-citations appear in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the standard assumed-density Gaussian approximation at each step and on the accuracy of the external analytic formula for neural-network moments; no new free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption State distribution remains approximately Gaussian after each nonlinear transition (core ADF assumption)
    Invoked when the analytic moments are inserted into the filter recursion; limits accuracy for strongly non-Gaussian posteriors.
  • domain assumption The recent analytic formula accurately computes mean and covariance of the neural-network output for Gaussian inputs
    Central to the uncertainty-propagation step; accuracy depends on the cited prior work and network architecture.

pith-pipeline@v0.9.0 · 5441 in / 1350 out tokens · 49161 ms · 2026-05-17T22:50:22.364079+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

    Relation between the paper passage and the cited Recognition theorem: unclear.

    Paper passage: For any σ, the exact moments of a single layer are given by three transcendental functions Mσ, Kσ, Lσ corresponding to bivariate Gaussian integrals... This paper introduces the ANALYTIC Kalman filter, in which 'N' is implemented using the layer-by-layer moment matching method introduced in Anonymous (2025)

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

7 extracted references · 7 canonical work pages · 1 internal anchor

  1. [1]

    URL https://ieeexplore.ieee.org/document/10787252

    doi:10.1109/LCSYS.2024.3514818. URL https://ieeexplore.ieee.org/document/10787252. Anonymous. Closed-form uncertainty quantification of deep residual neural networks. In Submitted to The Fourteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=CHWjetQzYS. under review. Kumar Anurag, Kasra Azizi, Francesco...

  2. [2]

    Belgacem, J.-L

    doi:10.1109/CDC42340.2020.9303764. URL https://ieeexplore.ieee.org/abstract/document/9303764. ISSN: 2576-2370. Shida Jiang, Junzhe Shi, and Scott Moura. A New Framework for Nonlinear Kalman Filters, February

  3. [3]

    Mitigating Overconfidence in Nonlinear Kalman Filters via Covariance Recalibration

    URL http://arxiv.org/abs/2407.05717. arXiv:2407.05717 [eess]. S. Julier, J. Uhlmann, and H.F. Durrant-Whyte. A new method for the nonlinear transformation of means and covariances in filters and estimators. IEEE Transactions on Automatic Control, 45(3):477–482, March 2000. ISSN 0018-9286. doi:10.1109/9.847726. URL http://ieeexplore.ieee.org/document/84772...

  4. [4]

    Interpretability in mapping weeds and crops from drone images

    doi:10.1109/IJCNN60899.2024.10650331. URL https://ieeexplore.ieee.org/abstract/document/10650331. ISSN: 2161-4407. Nima Mohajerin and Steven L. Waslander. Multistep Prediction of Dynamic Systems With Recurrent Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 30(11):3370–3383, November 2019. ISSN 2162-2388. doi:10.1109/TNNLS.201...

  5. [5]

    In: 2022 IEEE 61st Conference on Decision and Control (CDC)

    doi:10.1109/CDC51059.2022.9993245. URL https://ieeexplore.ieee.org/document/9993245/. ISSN: 2576-2370. Kumpati S. Narendra and Kannan Parthasarathy. Neural networks and dynamical systems. International Journal of Approximate Reasoning, 6(2):109–131, February 1992. ISSN 0888613X. doi:10.1016/0888-613X(92)90014-Q. URL https://linkinghub.elsevier.com/retri...

  6. [6]

    URL http://link.springer.com/10.1007/978-0-387-21736-9

    doi:10.1007/978-0-387-21736-9. URL http://link.springer.com/10.1007/978-0-387-21736-9. Yuezhu Xu and S. Sivaranjani. ECLipsE: Efficient Compositional Lipschitz Constant Estimation for Deep Neural Networks. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors, Advances in Neural Information Processing Systems, vol...

  7. [7]

    Supplementary material. Contents: 1 Introduction; 2 Notation; 3 Problem statement; 4 Neural networks; 4.1 The identity-augmentation operator

    URL https://proceedings.neurips.cc/paper_files/paper/2024/file/1419d8554191a65ea4f2d8e1057973e4-Paper-Conference.pdf. Supplementary material. Contents: 1 Introduction; 2 Notation; 3 Problem statement; 4 Neural networks; 4.1 The identity-augmentation operator; 4.2 Methods for uncertainty p...