LipsFormer: Introducing Lipschitz continuity to vision transformers. arXiv, abs/2304.09856
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Sinkhorn-normalized doubly stochastic attention preserves rank more effectively than Softmax row-stochastic attention, though in both cases the rank decays doubly exponentially to one as network depth increases.
citing papers explorer
- Reversible Residual Normalization Alleviates Spatio-Temporal Distribution Shift
Reversible Residual Normalization (RRN) introduces spatially-aware invertible residual blocks that combine center normalization with spectral-constrained graph convolutions to mitigate spatio-temporal distribution shifts in graph forecasting.
- Sinkhorn doubly stochastic attention rank decay analysis
Sinkhorn-normalized doubly stochastic attention preserves rank more effectively than Softmax row-stochastic attention, though in both cases the rank decays doubly exponentially to one as network depth increases.