
arxiv: 1906.06637 · v1 · pith:7QGH23A4 · submitted 2019-06-16 · cs.LG · math.OC · stat.ML

A Closer Look at Double Backpropagation

classification: cs.LG, math.OC, stat.ML
keywords: backpropagation, derivatives, description, double, functions, general, inputs, loss

Abstract

In recent years, an increasing number of neural network models have included derivatives with respect to inputs in their loss functions, resulting in so-called double backpropagation for first-order optimization. However, so far no general description of the involved derivatives exists. Here, we cover a wide array of special cases in a very general Hilbert space framework, which allows us to provide optimized backpropagation rules for many real-world scenarios. This includes the reduction of calculations for Frobenius-norm-penalties on Jacobians by roughly a third for locally linear activation functions. Furthermore, we provide a description of the discontinuous loss surface of ReLU networks both in the inputs and the parameters and demonstrate why the discontinuities do not pose a big problem in reality.
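To make the term concrete, the following is a minimal JAX sketch of double backpropagation, not the paper's method. It assumes a hypothetical two-layer ReLU network (`mlp`), a squared-error data term, and an illustrative penalty weight `lam`; none of these names or shapes come from the paper. The loss contains the squared Frobenius norm of the input Jacobian, so differentiating it with respect to the parameters requires differentiating through a derivative. The sketch relies on generic automatic differentiation and does not implement the optimized backpropagation rules the abstract refers to.

```python
import jax
import jax.numpy as jnp


def mlp(params, x):
    # Hypothetical two-layer ReLU network: f(x) = W2 relu(W1 x + b1) + b2.
    W1, b1, W2, b2 = params
    h = jax.nn.relu(W1 @ x + b1)
    return W2 @ h + b2


def jacobian_penalty(params, x):
    # Squared Frobenius norm of the Jacobian of the output w.r.t. the input x.
    J = jax.jacobian(mlp, argnums=1)(params, x)
    return jnp.sum(J ** 2)


def loss(params, x, y, lam=0.1):
    # Data term plus input-derivative penalty; lam is an assumed weight.
    pred = mlp(params, x)
    return jnp.sum((pred - y) ** 2) + lam * jacobian_penalty(params, x)


# Illustrative shapes: 3 inputs, 4 hidden units, 2 outputs.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (
    jax.random.normal(k1, (4, 3)),
    jnp.zeros(4),
    jax.random.normal(k2, (2, 4)),
    jnp.zeros(2),
)
x = jax.random.normal(k3, (3,))
y = jnp.ones(2)

# Gradient w.r.t. the parameters: this differentiates through the
# input Jacobian, which is the "double backpropagation" step.
grads = jax.grad(loss)(params, x, y)
```

For a ReLU network like this one, the input Jacobian equals `W2 @ diag(mask) @ W1`, where `mask` marks the active hidden units. It is therefore piecewise constant in `x` and jumps whenever a unit switches on or off, which is the kind of discontinuity in the penalized loss that the abstract mentions.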


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Layer-wise Derivative Controlled Networks

    cs.LG · 2026-05 · unverdicted · novelty 4.0

    ChainzRule with DREG regularization claims 15.5x fewer parameters than standard models, 23.1% lower peak gradient volatility on MNIST, and 70.17% accuracy on Yelp Full ordinal regression.