cc/paper_files/paper/2022/file/884baf65392170763b27c914087bde01-Paper-Conference
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 3
representative citing papers
KAN-CL cuts catastrophic forgetting by 88-93% on Split-CIFAR-10/5T and Split-CIFAR-100/10T by anchoring KAN parameters at per-knot granularity while matching baseline accuracy.
Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.
citing papers explorer
- Algorithmic Task Capture, Computational Complexity, and Inductive Bias of Infinite Transformers
  Infinite-width transformers exhibit an inductive bias against high-complexity polynomial-time algorithms, with derived upper bounds on capturable tasks like sorting and string matching.
- KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks
  KAN-CL cuts catastrophic forgetting by 88-93% on Split-CIFAR-10/5T and Split-CIFAR-100/10T by anchoring KAN parameters at per-knot granularity while matching baseline accuracy.
- Characterizing and Correcting Effective Target Shift in Online Learning
  Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.