The role of permutation invariance in linear mode connectivity of neural networks. arXiv preprint arXiv:2110.06296

Entezari, R · 2022 · arXiv 2110.06296

5 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Editing Models with Task Arithmetic

cs.LG · 2022-12-08 · accept · novelty 8.0

Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

Motion-Compensated Weight Compression

cs.CV · 2026-05-23 · unverdicted · novelty 6.0

MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.

Unlocking the Potential of Continual Model Merging: An ODE Perspective

cs.LG · 2026-05-19 · unverdicted · novelty 6.0 · 3 refs

ODE-M formulates continual model merging as a barrier-aware ODE trajectory in parameter space, using first-order feedback and a utility-aware schedule to balance retained knowledge and new task performance.

Don't Stop Me Yet: Sampling Loss Minima via Dissipative Riemannian Mechanics

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.

The Platonic Representation Hypothesis

cs.LG · 2024-05-13 · unverdicted · novelty 5.0

Representations learned by large AI models are converging toward a shared statistical model of reality.

citing papers explorer


Editing Models with Task Arithmetic · cs.LG · 2022-12-08 · accept · none · ref 23
Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.
Motion-Compensated Weight Compression · cs.CV · 2026-05-23 · unverdicted · none · ref 11
MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.
Unlocking the Potential of Continual Model Merging: An ODE Perspective · cs.LG · 2026-05-19 · unverdicted · none · ref 4 · 3 links
ODE-M formulates continual model merging as a barrier-aware ODE trajectory in parameter space, using first-order feedback and a utility-aware schedule to balance retained knowledge and new task performance.
Don't Stop Me Yet: Sampling Loss Minima via Dissipative Riemannian Mechanics · cs.LG · 2026-05-14 · unverdicted · none · ref 42
DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.
The Platonic Representation Hypothesis · cs.LG · 2024-05-13 · unverdicted · none · ref 78
Representations learned by large AI models are converging toward a shared statistical model of reality.
