FLUX.2: Frontier Visual Intelligence
5 Pith papers cite this work. Polarity classification is still being indexed.
fields: cs.CV (5)
years: 2026 (5)
roles: baseline (1)
polarities: baseline (1)
representative citing papers
- From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain
  BrainCause recovers known visual localizations and finds new candidate representations by validating causal specificity via counterfactual stimuli and encoding models, showing activation alone produces many false positives.
- Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling
  Edit-Compass and EditReward-Compass are new unified benchmarks for fine-grained image editing evaluation and realistic reward modeling in reinforcement learning optimization.
- Knowledge Visualization: A Benchmark and Method for Knowledge-Intensive Text-to-Image Generation
  KVBench reveals major gaps in current T2I models for knowledge-intensive tasks, and KE-Check narrows the gap between open- and closed-source models by adding structured knowledge and enforcing constraints.
- Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers
  A sparse set of massive activation channels in DiTs carries semantic information, proven critical by disruption tests, spatially aligned with image subjects via clustering, and transferable for semantic interpolation between prompts.
- Why Do DiT Editors Drift? Plug-and-Play Low Frequency Alignment in VAE Latent Space
  VAE-LFA suppresses semantic drift in multi-turn DiT image editing by low-pass filtering latent discrepancies and aligning low-frequency components to an EMA of previous rounds in VAE space.