Descriptive Complexity
4 Pith papers cite this work.
citing papers explorer
- Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't
  Padded transformers with constant or growing precision are equivalent to L-uniform AC^0 or TC^0, and with looping reach FO-uniform AC^d or TC^d, robust to width and attention mechanism.
- Neuro-Relational Programs: Unifying Queries and Neural Computation over Structured Data
  NRPs extend Datalog with embedding operations to create a single formalism readable as both query plans with trainable parts and neural architectures with relational structure.
- The $\mathsf{AC}^0$-Complexity Of Visibly Pushdown Languages
- Work-Efficient Query Evaluation in Constant Time with PRAMs