pith. machine review for the scientific record.

arxiv: 1605.07648 · v4 · submitted 2016-05-24 · 💻 cs.CV

Recognition: unknown

FractalNet: Ultra-Deep Neural Networks without Residuals

Authors on Pith: no claims yet
classification: 💻 cs.CV
keywords: networks, deep, fractal, neural, residual, subnetworks, answer, shallow
Original abstract

We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a simple expansion rule generates deep networks whose structural layouts are precisely truncated fractals. These networks contain interacting subpaths of different lengths, but do not include any pass-through or residual connections; every internal signal is transformed by a filter and nonlinearity before being seen by subsequent layers. In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks, thereby demonstrating that residual representations may not be fundamental to the success of extremely deep convolutional neural networks. Rather, the key may be the ability to transition, during training, from effectively shallow to deep. We note similarities with student-teacher behavior and develop drop-path, a natural extension of dropout, to regularize co-adaptation of subpaths in fractal architectures. Such regularization allows extraction of high-performance fixed-depth subnetworks. Additionally, fractal networks exhibit an anytime property: shallow subnetworks provide a quick answer, while deeper subnetworks, with higher latency, provide a more accurate answer.
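To make the expansion rule concrete, here is a minimal sketch, assuming PyTorch, of a fractal block with local drop-path: f_1(z) = conv(z), and f_{C+1}(z) = join((f_C ∘ f_C)(z), conv(z)), where join averages its inputs. The names `FractalBlock` and `conv_unit`, the conv/BN/ReLU unit, and the drop probability are illustrative assumptions, not the authors' reference code; only the recursive structure and the keep-at-least-one-path rule follow the paper's description.

```python
# Minimal sketch, assuming PyTorch. `FractalBlock`, `conv_unit`, and the
# hyperparameters are illustrative choices, not the authors' reference code.
import random

import torch
import torch.nn as nn


def conv_unit(in_ch, out_ch):
    # One "layer" in the paper's sense: every internal signal is transformed
    # by a filter and nonlinearity, with no pass-through connections.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class FractalBlock(nn.Module):
    """Expansion rule: f_1(z) = conv(z); f_{C+1}(z) = join(f_C(f_C(z)), conv(z))."""

    def __init__(self, columns, in_ch, out_ch, p_drop=0.15):
        super().__init__()
        self.p_drop = p_drop
        self.conv = conv_unit(in_ch, out_ch)  # shallow parallel subpath
        if columns > 1:
            # Deep subpath: two stacked copies of the (C-1)-column fractal,
            # so the longest path doubles with each expansion (2^(C-1) layers).
            self.branch = nn.Sequential(
                FractalBlock(columns - 1, in_ch, out_ch, p_drop),
                FractalBlock(columns - 1, out_ch, out_ch, p_drop),
            )
        else:
            self.branch = None

    def _join(self, paths):
        # Join = elementwise mean over incoming paths. Local drop-path:
        # during training each path is dropped with prob p_drop, but at
        # least one survivor is always kept so the signal can propagate.
        if self.training and len(paths) > 1:
            kept = [p for p in paths if random.random() > self.p_drop]
            if not kept:
                kept = [random.choice(paths)]
            paths = kept
        return torch.stack(paths, dim=0).mean(dim=0)

    def forward(self, z):
        paths = [self.conv(z)]
        if self.branch is not None:
            paths.append(self.branch(z))
        return self._join(paths)
```

A 4-column block, e.g. `FractalBlock(4, 64, 128)`, gives a longest path of 2^3 = 8 conv layers alongside progressively shallower columns; at inference all paths are active and averaged. The paper's global drop-path, which samples a single column for the whole network per minibatch and underlies both the fixed-depth subnetwork extraction and the anytime property mentioned above, is omitted here for brevity.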

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Understanding intermediate layers using linear classifier probes

    stat.ML 2016-10 accept novelty 7.0

    Linear probes demonstrate that feature separability for classification increases monotonically with network depth in Inception v3 and ResNet-50.

  2. mHC: Manifold-Constrained Hyper-Connections

    cs.CL 2025-12 unverdicted novelty 6.0

    mHC projects hyper-connection residual spaces onto a manifold to restore identity mapping, enabling stable large-scale training with performance gains over standard HC.

  3. YOLOv4: Optimal Speed and Accuracy of Object Detection

    cs.CV 2020-04 unverdicted novelty 5.0

    YOLOv4 achieves 43.5% AP (65.7% AP50) on MS COCO at ~65 FPS on Tesla V100 by integrating WRC, CSP, CmBN, SAT, Mish activation, Mosaic augmentation, DropBlock, and CIoU loss.