Simple vs complex temporal recurrences for video saliency prediction
This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain. The first modification is the addition of a ConvLSTM within the architecture, while the second is a conceptually simple exponential moving average of an internal convolutional state. We use weights pre-trained on the SALICON dataset and fine-tune our model on DHF1K. Our results show that both modifications achieve state-of-the-art results and produce similar saliency maps. Source code is available at https://git.io/fjPiB.
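The abstract's "conceptually simple" recurrence can be illustrated with a short sketch: an exponential moving average (EMA) of an internal convolutional feature map, updated frame by frame. The function name `ema_update` and the decay factor `alpha=0.9` are assumptions for illustration; the paper's actual placement of the EMA within the network and its decay value may differ.

```python
import numpy as np

def ema_update(state, new_features, alpha=0.9):
    """Blend the running state with the current frame's feature map.

    Hypothetical sketch of an EMA temporal recurrence; `alpha` is an
    assumed decay factor, not taken from the paper.
    """
    if state is None:  # first frame: initialize the state directly
        return new_features
    return alpha * state + (1.0 - alpha) * new_features

# Toy feature maps for three consecutive frames (channels x H x W)
frames = [np.full((1, 2, 2), v, dtype=np.float64) for v in (0.0, 1.0, 1.0)]

state = None
for f in frames:
    state = ema_update(state, f)

# After frames [0, 1, 1]: 0.9 * (0.9*0 + 0.1*1) + 0.1*1 = 0.19
```

Unlike a ConvLSTM, this recurrence adds no learned parameters; the temporal smoothing is a single weighted blend per frame.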
Forward citations
Cited by 2 Pith papers
-
BIAS: A Biologically Inspired Algorithm for Video Saliency Detection
BIAS is a biologically inspired video saliency model that integrates static and motion features via retina-like detection and multi-Gaussian fitting, outperforming baselines on DHF1K and anticipating traffic accidents...
-
DiffAttn: Diffusion-Based Drivers' Visual Attention Prediction with LLM-Enhanced Semantic Reasoning
DiffAttn formulates driver visual attention prediction as a conditional diffusion-denoising task with Swin Transformer encoding, multi-scale fusion, and LLM semantic reasoning, achieving SoTA results on four datasets.