Hardness of learning fixed parities with neural networks

Itamar Shoshani, Ohad Shamir · 2025 · arXiv 2501.00817

2 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary: other 1

citation-polarity summary: unclear 1

representative citing papers

Learning through Internalization

cs.LG · 2026-06-18 · unverdicted · novelty 7.0

A simplified one-layer transformer provably learns parities first with explicit CoT supervision then internalizes to direct computation as CoT tokens are removed.

Deep sequence models tend to memorize geometrically; it is unclear why

cs.LG · 2025-10-30 · unverdicted · novelty 6.0

Deep sequence models develop geometric memory in embeddings that encodes novel global relationships, transforming l-fold composition tasks into 1-step navigation via a natural spectral bias connected to Node2Vec.

citing papers explorer


Learning through Internalization · cs.LG · 2026-06-18 · unverdicted · none · ref 8
Deep sequence models tend to memorize geometrically; it is unclear why · cs.LG · 2025-10-30 · unverdicted · none · ref 165
