
A Closer Look at Double Backpropagation


1 Pith paper citing it
abstract

In recent years, an increasing number of neural network models have included derivatives with respect to inputs in their loss functions, resulting in so-called double backpropagation for first-order optimization. However, so far no general description of the involved derivatives exists. Here, we cover a wide array of special cases in a very general Hilbert space framework, which allows us to provide optimized backpropagation rules for many real-world scenarios. This includes the reduction of calculations for Frobenius-norm penalties on Jacobians by roughly a third for locally linear activation functions. Furthermore, we provide a description of the discontinuous loss surface of ReLU networks both in the inputs and the parameters and demonstrate why the discontinuities do not pose a big problem in practice.
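
To make the term concrete, the following is a minimal sketch of double backpropagation on a toy problem, not the paper's framework or its optimized rules. It assumes a hypothetical scalar network f(x) = w2 * tanh(w1 * x) and a loss that adds a penalty on the squared input derivative; the names `forward`, `input_grad`, `loss`, `loss_grad`, and the weight `lam` are illustrative choices, and tanh is picked only because it is smooth (the abstract's savings concern locally linear activations such as ReLU). The point it shows is that the parameter gradient must differentiate through the input derivative itself.

```python
import math


def forward(w1, w2, x):
    """Toy scalar network f(x) = w2 * tanh(w1 * x) (illustrative assumption)."""
    return w2 * math.tanh(w1 * x)


def input_grad(w1, w2, x):
    """First backward pass: df/dx = w2 * w1 * (1 - tanh(w1 * x)^2)."""
    t = math.tanh(w1 * x)
    return w2 * w1 * (1.0 - t * t)


def loss(w1, w2, x, y, lam):
    """Squared error plus lam times the squared derivative w.r.t. the input."""
    return (forward(w1, w2, x) - y) ** 2 + lam * input_grad(w1, w2, x) ** 2


def loss_grad(w1, w2, x, y, lam):
    """Second backward pass: gradient of the loss w.r.t. (w1, w2).

    The penalty term depends on df/dx, so its parameter gradient needs
    the derivative of df/dx, which involves tanh'' (hence "double").
    """
    t = math.tanh(w1 * x)
    s = 1.0 - t * t                  # tanh'(w1 * x)
    f = w2 * t                       # network output
    g = w2 * w1 * s                  # df/dx

    # Parameter derivatives of the output f.
    df_dw1 = w2 * s * x
    df_dw2 = t

    # Parameter derivatives of g; ds/dw1 = -2 * t * s * x.
    dg_dw1 = w2 * s + w2 * w1 * (-2.0 * t * s * x)
    dg_dw2 = w1 * s

    r = 2.0 * (f - y)                # derivative of the squared error
    return (r * df_dw1 + 2.0 * lam * g * dg_dw1,
            r * df_dw2 + 2.0 * lam * g * dg_dw2)
```

Calling `loss_grad(0.7, -1.3, 0.5, 0.2, 0.1)` returns the pair of parameter gradients for one sample; comparing it against central finite differences of `loss` is a simple way to check the hand-derived second pass. In practice an automatic differentiation library would carry out both passes.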

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

Layer-wise Derivative Controlled Networks

cs.LG · 2026-05-14 · unverdicted · novelty 4.0

ChainzRule with DREG regularization claims 15.5x fewer parameters than standard models, 23.1% lower peak gradient volatility on MNIST, and 70.17% accuracy on Yelp Full ordinal regression.

citing papers explorer


  • Layer-wise Derivative Controlled Networks cs.LG · 2026-05-14 · unverdicted · none · ref 7 · internal anchor
