FiLM: Visual Reasoning with a General Conditioning Layer

27 Pith papers cite this work. Polarity classification is still indexing.

abstract

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.
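The feature-wise affine transformation described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `film`, the list-based feature maps, and the hard-coded `gamma` and `beta` values are all assumptions made for the example. In an actual FiLM model the scale and shift would be predicted from the conditioning information (for visual reasoning, the question) rather than fixed by hand.

```python
from typing import List

FeatureMap = List[List[float]]  # one H x W feature map (hypothetical representation)


def film(features: List[FeatureMap], gamma: List[float], beta: List[float]) -> List[FeatureMap]:
    """Apply gamma[c] * x + beta[c] to every value x of feature map c."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per feature map")
    return [
        [[g * x + b for x in row] for row in fmap]
        for fmap, g, b in zip(features, gamma, beta)
    ]


# Two 2x2 feature maps. The gamma and beta values are hard-coded here purely
# for illustration; a real model would compute them from the conditioning input.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
gamma = [2.0, 0.0]   # scale the first map, suppress the second
beta = [0.5, -1.0]   # per-map shift
modulated = film(features, gamma, beta)
print(modulated)
```

Because each feature map gets its own scale and shift, the conditioning input can amplify, suppress, or offset individual features without changing the rest of the network.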

citation-role summary

background: 2 · method: 2

representative citing papers

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Local Diffusion Models and Phases of Data Distributions

cs.LG · 2025-08-08 · unverdicted · novelty 6.0

The paper introduces a phase framework for data distributions connected by local denoisers and demonstrates that reverse diffusion consists of trivial and data phases separated by a transition where local score functions must fail, tied to spatial Markovianity.

AE-ViT: Stable Long-Horizon Parametric Partial Differential Equations Modeling

cs.LG · 2026-04-07 · unverdicted · novelty 6.0

AE-ViT combines a convolutional autoencoder with a latent-space transformer and multi-stage parameter plus coordinate injection to deliver stable long-horizon predictions for parametric PDEs, cutting relative rollout error by roughly five times versus prior DL-ROMs and ViTs on advection-diffusion-re
