A Structured Self-attentive Sentence Embedding

12 Pith papers cite this work. Polarity classification is still indexing.

abstract

This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks.
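The abstract gives no equations, so the following is only a minimal sketch of the mechanism it describes. It assumes the formulation usually attributed to this paper: an attention matrix A = softmax(W2 · tanh(W1 · Hᵀ)), a matrix embedding M = A · H, and a penalty ‖A·Aᵀ − I‖²_F that discourages rows of A from attending to the same tokens. The function names, sizes, and random weights are illustrative assumptions, and the random `H` stands in for the hidden states of a sentence encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def structured_self_attention(H, W1, W2):
    """H: (n, d) token hidden states; W1: (d_a, d); W2: (r, d_a).

    Returns the attention matrix A of shape (r, n) and the
    matrix embedding M of shape (r, d).
    """
    scores = W2 @ np.tanh(W1 @ H.T)  # (r, n): one score row per attention hop
    A = softmax(scores, axis=1)      # each row is a distribution over tokens
    M = A @ H                        # (r, d): each row summarizes the sentence
    return A, M

def redundancy_penalty(A):
    # Squared Frobenius norm of (A A^T - I); smaller when rows of A
    # focus on different, narrow parts of the sentence.
    r = A.shape[0]
    diff = A @ A.T - np.eye(r)
    return float(np.sum(diff ** 2))

# Illustrative sizes (assumptions): 6 tokens, 8-dim states, 3 attention rows.
rng = np.random.default_rng(0)
n, d, d_a, r = 6, 8, 5, 3
H = rng.normal(size=(n, d))      # stand-in for encoder outputs
W1 = rng.normal(size=(d_a, d))
W2 = rng.normal(size=(r, d_a))

A, M = structured_self_attention(H, W1, W2)
P = redundancy_penalty(A)
```

Each row of `A` can be plotted over the tokens of the sentence, which is the visualization the abstract refers to.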

citation-role summary

background: 1 · other: 1

citation-polarity summary

unclear: 2

representative citing papers

Emergent Culture in Minimal LLM Systems

cs.NE · 2026-06-21 · unverdicted · novelty 7.0

Minimal collectives of three LLM agents develop spontaneous cooperation, storage strategies, and complex evolving cultural artifacts via interaction with a decaying shared text store and evolutionary pressure.

Graph Attention Networks

stat.ML · 2017-10-30 · accept · novelty 7.0

Graph Attention Networks compute learnable attention coefficients over node neighborhoods to produce weighted feature aggregations, achieving state-of-the-art results on citation networks and inductive protein-protein interaction graphs.

Cognitive State Inference from VR Motion via Motion Foundation Model

cs.HC · 2025-09-29 · unverdicted · novelty 6.0

VR head and hand motion data can be adapted to motion foundation models to classify cognitive states like confusion and hesitation at 82% accuracy with better cross-user generalization than baseline models on a new 24-participant dataset.

Universal Transformers

cs.CL · 2018-07-10 · unverdicted · novelty 6.0

Universal Transformers combine Transformer parallelism with recurrent updates and dynamic halting to achieve Turing-completeness under certain assumptions and outperform standard Transformers on algorithmic and language tasks.

Attention Is All You Need

cs.CL · 2017-06-12 · unverdicted · novelty 5.0
