The role of permutation invariance in linear mode connectivity of neural networks. arXiv preprint arXiv:2110.06296
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.
ODE-M formulates continual model merging as a barrier-aware ODE trajectory in parameter space, using first-order feedback and a utility-aware schedule to balance retained knowledge and new task performance.
DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.
Representations learned by large AI models are converging toward a shared statistical model of reality.
citing papers explorer
- Editing Models with Task Arithmetic
Task vectors from weight differences allow arithmetic operations to edit pre-trained models, improving multiple tasks simultaneously and enabling analogical inference on unseen tasks.
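As a rough illustration of the task-vector idea summarized above, the following minimal sketch treats a model as a dictionary of flat weight lists. The names `task_vector` and `apply_task_vectors`, the toy weights, and the `scale` coefficient are assumptions made for illustration; they are not taken from the paper's code.

```python
from typing import Dict, List

# Hypothetical toy format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Element-wise difference: fine-tuned weights minus pre-trained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_vectors(
    pretrained: Weights, vectors: List[Weights], scale: float = 1.0
) -> Weights:
    """Add the scaled sum of task vectors to a copy of the pre-trained weights."""
    edited = {name: list(values) for name, values in pretrained.items()}
    for vec in vectors:
        for name, delta in vec.items():
            edited[name] = [w + scale * d for w, d in zip(edited[name], delta)]
    return edited


# Toy weights (illustrative values only).
pretrained = {"layer.weight": [1.0, 2.0, 3.0]}
finetuned_a = {"layer.weight": [1.5, 2.0, 3.0]}  # fine-tuned on a hypothetical task A
finetuned_b = {"layer.weight": [1.0, 2.0, 2.0]}  # fine-tuned on a hypothetical task B

tau_a = task_vector(pretrained, finetuned_a)
tau_b = task_vector(pretrained, finetuned_b)

# Adding both task vectors yields one edited model that carries both updates.
multi_task = apply_task_vectors(pretrained, [tau_a, tau_b])
```

In practice the same arithmetic would be applied to every tensor in a real checkpoint, and the scaling coefficient would typically be tuned on held-out data; both points are assumptions of this sketch rather than details stated on this page.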