Learning spatiotemporal features with 3D convolutional networks

Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation

cs.CV · 2026-04-03 · conditional · novelty 7.0

STRNet improves goal-conditioned visual navigation by replacing simplistic encoders and pooling with a spatio-temporal fusion module that performs spatial graph reasoning and hybrid temporal modeling.

B-MoE: A Body-Part-Aware Mixture-of-Experts "All Parts Matter" Approach to Micro-Action Recognition

cs.CV · 2026-03-25 · unverdicted · novelty 7.0

The B-MoE framework achieves state-of-the-art performance on micro-action recognition by using region-specific experts and cross-attention routing.

The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection

cs.CV · 2025-12-23 · unverdicted · novelty 6.0

KeyTailor improves video virtual try-on realism by using instruction-guided keyframes to enhance garment details and background integrity in DiT models without major architectural changes.

citing papers explorer

STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation
cs.CV · 2026-04-03 · conditional · none · ref 40

B-MoE: A Body-Part-Aware Mixture-of-Experts "All Parts Matter" Approach to Micro-Action Recognition
cs.CV · 2026-03-25 · unverdicted · none · ref 32

The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection
cs.CV · 2025-12-23 · unverdicted · none · ref 34
