pith. machine review for the scientific record.

arxiv: 1805.09786 · v1 · submitted 2018-05-24 · 💻 cs.NE

Recognition: unknown

Hyperbolic Attention Networks

classification 💻 cs.NE
keywords networks · hyperbolic · neural · attention · geometry · imposing · terms · achieve

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks. This allows us to exploit hyperbolic geometry to reason about embeddings produced by deep networks. We achieve this by re-expressing the ubiquitous mechanism of soft attention in terms of operations defined for hyperboloid and Klein models. Our method shows improvements in terms of generalization on neural machine translation, learning on graphs and visual question answering tasks while keeping the neural representations compact.
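The abstract's central idea, soft attention re-expressed with hyperboloid and Klein operations, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that attention scores are the negative hyperbolic distance between query and key on the hyperboloid, that weights come from a softmax (the `beta` temperature is a hypothetical parameter), and that values are aggregated with the Einstein midpoint in the Klein model. The helper `lift_to_hyperboloid` is also an assumption made for illustration: it simply appends the time-like coordinate needed to place a Euclidean vector on the hyperboloid.

```python
import math

def lift_to_hyperboloid(v):
    """Assumed lifting: place Euclidean v on {x : -x0^2 + |x_rest|^2 = -1, x0 > 0}."""
    x0 = math.sqrt(1.0 + sum(c * c for c in v))
    return [x0] + list(v)

def lorentz_inner(x, y):
    """Lorentzian inner product: minus sign on the time-like coordinate."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def hyperboloid_distance(x, y):
    """Geodesic distance on the hyperboloid: arccosh(-<x, y>_L)."""
    # Clamp at 1.0 so rounding error cannot leave the domain of acosh.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

def to_klein(x):
    """Project a hyperboloid point into the Klein model (open unit ball)."""
    return [c / x[0] for c in x[1:]]

def einstein_midpoint(weights, klein_points):
    """Weighted Einstein midpoint; each point is scaled by its Lorentz factor."""
    gammas = [1.0 / math.sqrt(1.0 - sum(c * c for c in p)) for p in klein_points]
    denom = sum(w * g for w, g in zip(weights, gammas))
    dim = len(klein_points[0])
    return [
        sum(w * g * p[i] for w, g, p in zip(weights, gammas, klein_points)) / denom
        for i in range(dim)
    ]

def hyperbolic_attention(query, keys, values, beta=1.0):
    """Return (weights, aggregate): softmax over negative distances, then midpoint."""
    q = lift_to_hyperboloid(query)
    scores = [-beta * hyperboloid_distance(q, lift_to_hyperboloid(k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    klein_values = [to_klein(lift_to_hyperboloid(v)) for v in values]
    return weights, einstein_midpoint(weights, klein_values)

if __name__ == "__main__":
    w, m = hyperbolic_attention(
        [0.0, 0.0],
        keys=[[0.1, 0.0], [2.0, 0.0]],
        values=[[1.0, 0.0], [0.0, 1.0]],
    )
    print("weights:", w)
    print("aggregate (Klein coordinates):", m)
```

In this sketch the key closer to the query in hyperbolic distance receives the larger weight, and the aggregate stays inside the unit ball because the Einstein midpoint is a convex combination of Klein points.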

This paper has not been read by Pith yet.


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. New non-Euclidean neural quantum states from additional types of hyperbolic recurrent neural networks

    quant-ph 2026-04 unverdicted novelty 7.0

    Hyperbolic RNN and GRU neural quantum states outperform Euclidean versions on Heisenberg J1J2 and J1J2J3 models with 100 spins.

  2. Rates of forgetting for the sequentially Markov coalescent

    math.PR 2026-04 unverdicted novelty 7.0

    SMC forgets its initial condition geometrically in the jump chain and as 1/ℓ in continuous genetic distance, justifying independent-locus approximations.

  3. HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

    cs.AI 2026-04 unverdicted novelty 6.0

    HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.

  4. Attention-based graph neural networks: a survey

    cs.SI 2026-05 unverdicted novelty 5.0

    The survey groups attention-based GNNs into three stages—graph recurrent attention networks, graph attention networks, and graph transformers—while reviewing architectures and future directions.