An exploration of hierarchical attention transformers for efficient long document classification
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
A two-level overlapping Schwarz domain decomposition constructs a hierarchical attention operator that trains faster and approximates the inverse of a discretized 1D diffusion operator more accurately than global low-rank attention while using fewer parameters.
The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
citing papers explorer
- NEST: Nested Event Stream Transformer for Sequences of Multisets
  NEST is a nested transformer for sequences of multisets that uses masked set modeling to learn improved set-level representations from hierarchical event streams like EHRs.
- Hierarchical Attention via Domain Decomposition
  A two-level overlapping Schwarz domain decomposition constructs a hierarchical attention operator that trains faster and approximates the inverse of a discretized 1D diffusion operator more accurately than global low-rank attention while using fewer parameters.