The role of permutation invariance in linear mode connectivity of neural networks. arXiv preprint arXiv:2110.06296

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

fields

cs.LG 4 cs.CV 1

roles

background 1

polarities

background 1

representative citing papers

Editing Models with Task Arithmetic

cs.LG · 2022-12-08 · accept · novelty 8.0

Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

Motion-Compensated Weight Compression

cs.CV · 2026-05-23 · unverdicted · novelty 6.0

MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.

The Platonic Representation Hypothesis

cs.LG · 2024-05-13 · unverdicted · novelty 5.0

Representations learned by large AI models are converging toward a shared statistical model of reality.

citing papers explorer

Showing 5 of 5 citing papers.

  • Editing Models with Task Arithmetic cs.LG · 2022-12-08 · accept · none · ref 23

    Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.

  • Motion-Compensated Weight Compression cs.CV · 2026-05-23 · unverdicted · none · ref 11

    MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.

  • Unlocking the Potential of Continual Model Merging: An ODE Perspective cs.LG · 2026-05-19 · unverdicted · none · ref 4 · 3 links

    ODE-M formulates continual model merging as a barrier-aware ODE trajectory in parameter space, using first-order feedback and a utility-aware schedule to balance retained knowledge and new task performance.

  • Don't Stop Me Yet: Sampling Loss Minima via Dissipative Riemannian Mechanics cs.LG · 2026-05-14 · unverdicted · none · ref 42

    DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.

  • The Platonic Representation Hypothesis cs.LG · 2024-05-13 · unverdicted · none · ref 78

    Representations learned by large AI models are converging toward a shared statistical model of reality.