arxiv: 1611.05267 · v1 · pith:EMMGIVJVnew · submitted 2016-11-16 · 💻 cs.CV

Temporal Convolutional Networks for Action Segmentation and Detection

classification 💻 cs.CV
keywords: temporal, action, fine-grained, networks, convolutional, convolutions, detection, dilated

The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond. Typical approaches decouple this problem by first extracting local spatiotemporal features from video frames and then feeding them into a temporal classifier that captures high-level temporal patterns. We introduce a new class of temporal models, which we call Temporal Convolutional Networks (TCNs), that use a hierarchy of temporal convolutions to perform fine-grained action segmentation or detection. Our Encoder-Decoder TCN uses pooling and upsampling to efficiently capture long-range temporal patterns, whereas our Dilated TCN uses dilated convolutions. We show that TCNs are capable of capturing action compositions, segment durations, and long-range dependencies, and are over an order of magnitude faster to train than competing LSTM-based Recurrent Neural Networks. We apply these models to three challenging fine-grained datasets and show large improvements over the state of the art.
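The two mechanisms named in the abstract, dilated temporal convolution and pooling followed by upsampling, can be illustrated with a minimal single-channel sketch. This is not the paper's implementation: the function names, the zero padding, the nearest-neighbour upsampling, and the doubling dilation schedule are all assumptions made for illustration, and real TCN layers would use learned multi-channel filters with nonlinearities.

```python
def dilated_conv1d(x, weights, dilation):
    """Same-length 1D convolution whose taps are spaced `dilation` steps apart.

    Illustrative only: single channel, zero padding at both ends.
    """
    center = len(weights) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t + (j - center) * dilation  # tap position in the input
            if 0 <= idx < len(x):              # positions outside count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Input time steps visible to one output after stacking the given layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def max_pool1d(x, width):
    """Non-overlapping temporal max pooling (encoder-style downsampling)."""
    return [max(x[i:i + width]) for i in range(0, len(x) - width + 1, width)]


def upsample_nearest(x, factor):
    """Repeat each time step `factor` times (decoder-style upsampling)."""
    return [v for v in x for _ in range(factor)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
conv_out = dilated_conv1d(signal, [1.0, 0.0, 1.0], dilation=2)

# Assumed schedule: dilation doubles at each of four layers.
rf = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])

pooled = max_pool1d([1.0, 3.0, 2.0, 5.0], width=2)
restored = upsample_nearest(pooled, factor=2)
```

Under these assumptions, stacking layers with growing dilation widens the temporal context without shortening the sequence, while the pooling and upsampling route reaches long-range context by working on a shorter sequence and then restoring the original length.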

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. QuChaTeR: A Hybrid Quantum-Chaotic Temporal Framework for Earthquake Prediction

    cs.LG 2026-05 unverdicted novelty 4.0

    QuChaTeR hybridizes chaotic maps and variational quantum circuits with recurrent networks and wavelets to achieve faster convergence and better performance than classical and quantum-inspired baselines on real seismic...