An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Mixed citation behavior. Most common role is background (67%).

86 Pith papers citing it
Background 67% of classified citations
abstract

For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at http://github.com/locuslab/TCN.

citation-role summary

background 8 · baseline 3 · method 1

representative citing papers

Efficiently Modeling Long Sequences with Structured State Spaces

cs.LG · 2021-10-31 · unverdicted · novelty 8.0

S4 is an efficient state space sequence model that captures long-range dependencies via structured parameterization of the SSM, achieving state-of-the-art results on the Long Range Arena and other benchmarks while being faster than Transformers for generation.

GAFSV-Net: A Vision Framework for Online Signature Verification

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

GAFSV-Net encodes online signatures as asymmetric Gramian Angular Field images and processes them with dual-branch ConvNeXt plus cross-attention to outperform sequence-based baselines on DeepSignDB and BiosecurID.

Adversarial Robustness of Deep State Space Models for Forecasting

cs.LG · 2026-04-03 · conditional · novelty 7.0

Spacetime SSM forecasters represent optimal Kalman predictors for autoregressive data but remain vulnerable to model-free attacks that exploit local linearity and increase error by over 33% compared to projected gradient descent.

Causal Time Series Generation via Diffusion Models

cs.LG · 2025-09-25 · unverdicted · novelty 7.0

CaTSG is a unified diffusion model for causal time series generation that handles observational, interventional, and counterfactual tasks via backdoor adjustment and abduction-action-prediction.

ReactiveGWM: Steering NPC in Reactive Game World Models

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

ReactiveGWM introduces a decoupled diffusion architecture for player-NPC interactions that learns game-agnostic response logic for zero-shot strategy transfer across games.
