arxiv: 1602.03609 · v1 · pith:4STHICGCnew · submitted 2016-02-11 · 💻 cs.CL · cs.LG

Attentive Pooling Networks

classification 💻 cs.CL cs.LG
keywords networks, pooling, attention, input, neural, attentive, mechanism, pair

In this work, we propose Attentive Pooling (AP), a two-way attention mechanism for discriminative model training. In the context of pair-wise ranking or classification with neural networks, AP enables the pooling layer to be aware of the current input pair, in a way that information from the two input items can directly influence the computation of each other's representations. Along with such representations of the paired inputs, AP jointly learns a similarity measure over projected segments (e.g. trigrams) of the pair, and subsequently, derives the corresponding attention vector for each input to guide the pooling. Our two-way attention mechanism is a general framework independent of the underlying representation learning, and it has been applied to both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) in our studies. The empirical results, from three very different benchmark tasks of question answering/answer selection, demonstrate that our proposed models outperform a variety of strong baselines and achieve state-of-the-art performance in all the benchmarks.
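
As a rough illustration of the two-way attention described above, here is a minimal sketch in dependency-free Python. The abstract does not give the functional form, so the following are assumptions made for illustration only: a bilinear similarity passed through tanh, a row-wise and column-wise max over the similarity matrix, a softmax to obtain the attention vectors, and every name used (`attentive_pooling`, `Q`, `A`, `U`). In a trained model `U` would be learned jointly with the encoder; here it is fixed by hand.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_pooling(Q, A, U):
    """Pool two inputs with attention derived from their pairwise similarity.

    Q is d x M (one column per segment of the first input),
    A is d x L (one column per segment of the second input),
    U is d x d (assumed bilinear similarity parameters).
    """
    # Assumed similarity: G[i][j] = tanh(q_i^T U a_j) for every segment pair.
    scores = matmul(matmul(transpose(Q), U), A)
    G = [[math.tanh(v) for v in row] for row in scores]
    # Importance of each segment = its best match in the other input.
    g_q = [max(row) for row in G]
    g_a = [max(col) for col in zip(*G)]
    sigma_q = softmax(g_q)
    sigma_a = softmax(g_a)
    # Pooled representation = attention-weighted sum of segment columns.
    r_q = [sum(w * x for w, x in zip(sigma_q, row)) for row in Q]
    r_a = [sum(w * x for w, x in zip(sigma_a, row)) for row in A]
    return r_q, r_a, sigma_q, sigma_a

# Toy inputs: d = 2, two segments in Q, three segments in A, identity U.
Q = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
U = [[1.0, 0.0],
     [0.0, 1.0]]
r_q, r_a, sigma_q, sigma_a = attentive_pooling(Q, A, U)
```

In this toy case the middle segment of `A` matches nothing in `Q`, so it receives the smallest attention weight. That is the sense in which each input influences how the other is pooled.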

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Systematic Evaluation of Molecular Mixture Behavior Prediction

    cs.LG 2026-05 unverdicted novelty 7.0

    Strong absolute accuracy on mixture properties often masks poor recovery of non-ideal behavior, with large drops under strict molecule splits, making transfer to unseen molecules the central challenge.

  2. Explainable AI in Speaker Recognition -- Attention Map Visualisation and Evaluation

    eess.AS 2026-06 unverdicted novelty 5.0

    The paper introduces Modified RISE-eval to evaluate GradCAM and LayerCAM attention maps on speaker recognition networks and reports distinct advantages for each method under different conditions.