Title resolution pending

Self-Attention with Relative Position Representations , author= · 2018

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

RT-Transformer: The Transformer Block as a Spherical State Estimator

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.

Flex Attention: A Programming Model for Generating Optimized Attention Kernels

cs.LG · 2024-12-07 · unverdicted · novelty 6.0

FlexAttention supplies a compiler-driven interface that expresses common attention variants in a few lines of PyTorch and emits optimized kernels whose speed matches hand-written implementations.

Mesh Based Simulations with Spatial and Temporal awareness

cs.LG · 2026-05-02 · unverdicted · novelty 5.0

A unified training framework for mesh-based ML surrogates in CFD improves accuracy and long-horizon stability by enforcing spatial derivative consistency via multi-node prediction, using temporal cross-attention correction, and adding 3D rotary positional embeddings.

citing papers explorer

Showing 3 of 3 citing papers.

RT-Transformer: The Transformer Block as a Spherical State Estimator cs.LG · 2026-05-10 · unverdicted · none · ref 20
Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.
Flex Attention: A Programming Model for Generating Optimized Attention Kernels cs.LG · 2024-12-07 · unverdicted · none · ref 6
FlexAttention supplies a compiler-driven interface that expresses common attention variants in a few lines of PyTorch and emits optimized kernels whose speed matches hand-written implementations.
Mesh Based Simulations with Spatial and Temporal awareness cs.LG · 2026-05-02 · unverdicted · none · ref 174
A unified training framework for mesh-based ML surrogates in CFD improves accuracy and long-horizon stability by enforcing spatial derivative consistency via multi-node prediction, using temporal cross-attention correction, and adding 3D rotary positional embeddings.

Title resolution pending

fields

years

verdicts

representative citing papers

citing papers explorer