
arxiv: 1902.04502 · v1 · pith:4V7WEXNOnew · submitted 2019-02-12 · 💻 cs.CV

Fast-SCNN: Fast Semantic Segmentation Network

keywords: segmentation, network, resolution, computation, fast, semantic, cityscapes, data

The encoder-decoder framework is state-of-the-art for offline semantic image segmentation. Since the rise in autonomous systems, real-time computation is increasingly desirable. In this paper, we introduce fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model on high resolution image data (1024×2048 px) suited to efficient computation on embedded devices with low memory. Building on existing two-branch methods for fast segmentation, we introduce our 'learning to downsample' module which computes low-level features for multiple resolution branches simultaneously. Our network combines spatial detail at high resolution with deep features extracted at lower resolution, yielding an accuracy of 68.0% mean intersection over union at 123.5 frames per second on Cityscapes. We also show that large scale pre-training is unnecessary. We thoroughly validate our metric in experiments with ImageNet pre-training and the coarse labeled data of Cityscapes. Finally, we show even faster computation with competitive results on subsampled inputs, without any network modifications.
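The two headline numbers in the abstract, mean intersection over union (mIoU) and frames per second, can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' evaluation code: the function names `mean_iou` and `frame_budget_ms` are hypothetical, and skipping classes that appear in neither prediction nor ground truth is an assumption (benchmark scripts such as the Cityscapes one typically accumulate counts over the whole dataset and may handle absent classes differently).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flat sequences of class labels.

    Assumption: classes absent from both pred and target are skipped
    rather than counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as c
    target_count = [0] * num_classes   # pixels labeled as c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


def frame_budget_ms(fps):
    """Per-frame time budget in milliseconds implied by a frame rate."""
    return 1000.0 / fps


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))  # per-class IoU: 1/3, 2/3, 1/2
print(frame_budget_ms(123.5))                 # roughly 8.1 ms per frame
```

On this reading, the reported 123.5 frames per second corresponds to a budget of about 8.1 ms per 1024×2048 image, which is what "above real-time" amounts to in practice.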



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LightAVSeg: Lightweight Audio-Visual Segmentation

    cs.CV 2026-05 unverdicted novelty 6.0

    LightAVSeg decouples semantic filtering and spatial grounding to achieve linear-cost cross-modal interaction in audio-visual segmentation, reaching 50.4 mIoU on MS3 with 20.5M parameters as a new lightweight state-of-the-art.

  2. Attention-Mamba: A Mamba-Enhanced Multi-Scale Parallel Inference Network for Medical Image Segmentation

    cs.CV 2024-02 unverdicted novelty 5.0

    Attention-Mamba uses parallel branches, Recursive Alignment Module, and Mamba-enhanced attention to report highest segmentation accuracy on Synapse, ACDC, ISIC-2018, and PH2 with 14.05M parameters and 8.94 GFLOPs.