arxiv: 1807.00053 · v2 · pith:B2XKAQ5Xnew · submitted 2018-06-20 · 🧬 q-bio.NC · cs.AI · cs.CV · cs.LG · cs.NE

Task-Driven Convolutional Recurrent Models of the Visual System

classification 🧬 q-bio.NC · cs.AI · cs.CV · cs.LG · cs.NE
keywords visual · areas · cnns · recurrence · recurrent · system · brain · cells

Feed-forward convolutional neural networks (CNNs) are currently state-of-the-art for object classification tasks such as ImageNet. Further, they are quantitatively accurate models of temporally-averaged responses of neurons in the primate brain's visual system. However, biological visual systems have two ubiquitous architectural features not shared with typical CNNs: local recurrence within cortical areas, and long-range feedback from downstream areas to upstream areas. Here we explored the role of recurrence in improving classification performance. We found that standard forms of recurrence (vanilla RNNs and LSTMs) do not perform well within deep CNNs on the ImageNet task. In contrast, novel cells that incorporated two structural features, bypassing and gating, were able to boost task accuracy substantially. We extended these design principles in an automated search over thousands of model architectures, which identified novel local recurrent cells and long-range feedback connections useful for object recognition. Moreover, these task-optimized ConvRNNs matched the dynamics of neural activity in the primate visual system better than feedforward networks, suggesting a role for the brain's recurrent connections in performing difficult visual behaviors.
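
The abstract names bypassing and gating as the two structural features of the successful cells but does not define them here. The following is a minimal sketch under stated assumptions, not the paper's cell: it assumes "bypassing" means that a zero hidden state lets the feedforward drive pass through unchanged, and "gating" means a sigmoid gate that scales how much of the previous state is kept. The 1-D convolution, the function names, and all parameters are hypothetical and chosen only to keep the example small.

```python
import math


def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; output has len(x) entries."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_bypass_step(x, h, w_in, w_rec, w_gate, gate_bias):
    """One time step of a toy gated convolutional recurrent cell.

    x: feedforward input from the layer below (list of floats)
    h: previous hidden state, same length as x
    All kernels are hypothetical illustration parameters.
    """
    ff = conv1d_same(x, w_in)         # feedforward drive
    rec = conv1d_same(h, w_rec)       # local recurrent drive
    gate_pre = conv1d_same(x, w_gate)
    new_h = []
    for f, r, g, h_prev in zip(ff, rec, gate_pre, h):
        gate = sigmoid(g + gate_bias)  # gating (assumed): fraction of old state kept
        # Bypass (assumed): if h is all zeros, both recurrent terms vanish
        # and the new state equals the feedforward drive exactly.
        new_h.append(gate * h_prev + f + r)
    return new_h
```

With a zero initial state the first step reduces to a plain convolution of the input, so the cell initially behaves like a feedforward layer; later steps mix in the gated previous state.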


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Self-organized MT Direction Maps Emerge from Spatiotemporal Contrastive Optimization

    q-bio.NC 2026-05 unverdicted novelty 6.0

    Direction maps and pinwheel structures in MT emerge spontaneously when a spatiotemporal deep network is trained on videos with contrastive self-supervised learning and spatial regularization.