Xception: Deep Learning with Depthwise Separable Convolutions

François Chollet

arxiv: 1610.02357 · v3 · pith:LSTP655Ynew · submitted 2016-10-07 · 💻 cs.CV



keywords: inception, convolution, depthwise, separable, architecture, xception, convolutional, convolutions


Abstract

We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions. We show that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset (which Inception V3 was designed for), and significantly outperforms Inception V3 on a larger image classification dataset comprising 350 million images and 17,000 classes. Since the Xception architecture has the same number of parameters as Inception V3, the performance gains are not due to increased capacity but rather to a more efficient use of model parameters.
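The operation the abstract describes, a depthwise convolution followed by a pointwise convolution, can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, the (H, W, C) layout, stride 1, "valid" padding, and the absence of biases and nonlinearities are all assumptions made for illustration. The two helper functions at the end count weights for a regular convolution and for its separable counterpart, which gives a rough sense of what "more efficient use of model parameters" can mean at the level of a single layer.

```python
import numpy as np


def depthwise_conv2d(x, dw_kernels):
    """Apply one k x k spatial filter per input channel.

    x:          (H, W, C) input feature map
    dw_kernels: (k, k, C) one spatial kernel per channel
    Assumes stride 1 and 'valid' padding (illustrative choices).
    """
    h, w, c = x.shape
    k = dw_kernels.shape[0]
    out_h, out_w = h - k + 1, w - k + 1
    out = np.zeros((out_h, out_w, c), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + k, j:j + k, :]  # (k, k, C)
            # Channels are never mixed here: sum over space only.
            out[i, j, :] = np.sum(patch * dw_kernels, axis=(0, 1))
    return out


def pointwise_conv2d(x, pw_kernel):
    """1 x 1 convolution: mix channels at each spatial position.

    x:         (H, W, C_in)
    pw_kernel: (C_in, C_out)
    """
    return x @ pw_kernel


def separable_conv2d(x, dw_kernels, pw_kernel):
    """Depthwise convolution followed by pointwise convolution."""
    return pointwise_conv2d(depthwise_conv2d(x, dw_kernels), pw_kernel)


def regular_conv_params(k, c_in, c_out):
    """Weight count of a regular k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weight count of the depthwise + pointwise pair (no bias)."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 8, 16))
    dw = rng.standard_normal((3, 3, 16))
    pw = rng.standard_normal((16, 32))
    y = separable_conv2d(x, dw, pw)
    print(y.shape)
    print(regular_conv_params(3, 16, 32), separable_conv_params(3, 16, 32))
```

With no nonlinearity between the two steps, the result equals a regular convolution whose kernel factorizes as `dw[a, b, c] * pw[c, o]`, so the separable form is a constrained regular convolution with far fewer free weights.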



Forward citations

Cited by 13 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
cs.CV 2017-04 accept novelty 7.0

MobileNets introduce depthwise separable convolutions plus width and resolution multipliers to produce efficient CNNs that trade off latency and accuracy for mobile and embedded vision applications.

Deepfake Detection Generalization with Diffusion Noise
cs.CV 2026-04 unverdicted novelty 6.0

ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.

On-chip probabilistic inference for charged-particle tracking at the sensor edge
physics.ins-det 2026-02 unverdicted novelty 6.0

Neural networks integrated into silicon sensor front-end electronics can regress charged-particle hit positions and angles with calibrated uncertainties from single-layer data while satisfying hardware constraints on ...

Separable Convolutional LSTMs for Faster Video Segmentation
cs.CV 2019-07 unverdicted novelty 6.0

Separable convLSTMs cut parameters and FLOPs in video segmentation, delivering up to 15% faster GPU inference with similar or slightly lower accuracy.

Rethinking Atrous Convolution for Semantic Image Segmentation
cs.CV 2017-06 unverdicted novelty 6.0

DeepLabv3 improves semantic segmentation by capturing multi-scale context with cascaded or parallel atrous convolutions and adding global context to ASPP, achieving better results on PASCAL VOC 2012 without DenseCRF p...

EPNAS: Efficient Progressive Neural Architecture Search
cs.LG 2019-07 unverdicted novelty 5.0

EPNAS uses a progressive search policy with REINFORCE performance prediction to search neural architectures in parallel, supporting multiple resource constraints and outperforming ENAS and PNAS on CIFAR-10 and ImageNe...

The Ethical Dilemma when (not) Setting up Cost-based Decision Rules in Semantic Segmentation
cs.CV 2019-07 unverdicted novelty 5.0

Defining egoistic and altruistic cost functions for class confusions in semantic segmentation changes precision, recall, and segment-wise error rates relative to standard MAP decisions.

Remote Estimation of Free-Flow Speeds
cs.CV 2019-06 unverdicted novelty 5.0

A CNN estimates free-flow speeds from aerial imagery and metadata, performing nearly as well with imagery alone as with road features.

Deep Single Image Deraining Via Estimating Transmission and Atmospheric Light in rainy Scenes
cs.CV 2019-06 unverdicted novelty 5.0

A deep network estimates per-image atmospheric light and a transmission map, then recovers a clear image from the atmospheric scattering model, outperforming prior deraining methods.

Attention Is All You Need
cs.CL 2017-06 unverdicted novelty 5.0

Pith review generated a malformed one-line summary.

DYMAPIA: A Multi-Domain Framework for Detecting AI-based Video Manipulation
cs.CV 2026-04 unverdicted novelty 4.0

DYMAPIA builds dynamic anomaly masks from Fourier spectra, texture, edges, and optical flow to guide a lightweight DistXCNet classifier, reporting over 99% accuracy and F1 on FF++, Celeb-DF, and VDFD.

Measuring the Transferability of Adversarial Examples
cs.LG 2019-07 unverdicted novelty 3.0

Empirical measurement of adversarial example transferability between VGG and Inception model classes with methodological refinements to attack strength selection, perturbation clipping, and evaluation via SSIM.

A Comprehensive Comparison of Deep Learning Architectures for COVID-19 Classification on CT & X-ray Imagery
cs.CV 2026-05 unverdicted novelty 2.0

ResNet and VGG models achieve 95-98% average accuracy distinguishing COVID-19 from normal lung images on X-ray and CT datasets using transfer learning from pre-trained networks.