
arxiv: 1506.01066 · v2 · pith:ASQZFOF2new · submitted 2015-06-02 · 💻 cs.CL

Visualizing and Understanding Neural Models in NLP

classification 💻 cs.CL
keywords compositionality, methods, models, neural, simple, visualizing, lstms, meaning

While neural networks have been successfully applied to many NLP tasks, the resulting vector-based models are very difficult to interpret. For example, it is not clear how they achieve compositionality, building sentence meaning from the meanings of words and phrases. In this paper we describe four strategies for visualizing compositionality in neural models for NLP, inspired by similar work in computer vision. We first plot unit values to visualize the compositionality of negation, intensification, and concessive clauses, allowing us to see well-known markedness asymmetries in negation. We then introduce three simple and straightforward methods for visualizing a unit's salience, the amount it contributes to the final composed meaning: (1) gradient back-propagation, (2) the variance of a token from the average word node, and (3) LSTM-style gates that measure information flow. We test our methods on sentiment using simple recurrent nets and LSTMs. Our general-purpose methods may have wide applications for understanding compositionality and other semantic properties of deep networks, and also shed light on why LSTMs outperform simple recurrent nets.
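
As an illustration of salience methods (1) and (2) named in the abstract, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes a toy simple recurrent net with random weights and a scalar score read from the last hidden state, and it takes "variance" to mean the squared deviation of each token vector from the sentence's average word vector. The names rnn_forward, gradient_salience, and variance_salience, and all shapes and weights, are hypothetical.

```python
import numpy as np

def rnn_forward(X, W, U, v):
    """Run a simple recurrent net over word vectors X (T x d).

    Returns the scalar score v . h_T and all hidden states (T x k).
    """
    h = np.zeros(W.shape[0])
    hs = []
    for x in X:
        h = np.tanh(W @ h + U @ x)
        hs.append(h)
    return float(v @ h), np.array(hs)

def gradient_salience(X, W, U, v):
    """Absolute gradient of the score w.r.t. every input dimension,
    computed by back-propagation through time."""
    _, hs = rnn_forward(X, W, U, v)
    grads = np.zeros_like(X)
    dh = v.copy()                       # d score / d h_T
    for t in reversed(range(len(X))):
        da = dh * (1.0 - hs[t] ** 2)    # back through tanh
        grads[t] = U.T @ da             # gradient w.r.t. x_t
        dh = W.T @ da                   # gradient w.r.t. h_{t-1}
    return np.abs(grads)

def variance_salience(X):
    """Squared deviation of each token vector from the average word vector."""
    return (X - X.mean(axis=0)) ** 2

# Toy data: 4 tokens, 3-dimensional embeddings, 5 hidden units (all assumed).
rng = np.random.default_rng(0)
T, d, k = 4, 3, 5
X = rng.normal(size=(T, d))
W = rng.normal(scale=0.5, size=(k, k))
U = rng.normal(scale=0.5, size=(k, d))
v = rng.normal(size=k)

grad_sal = gradient_salience(X, W, U, v)   # shape (T, d)
var_sal = variance_salience(X)             # shape (T, d)
```

Both functions return a tokens-by-dimensions matrix, which could be rendered as a heatmap to compare which words each measure highlights. The gate-based method (3) is omitted because it depends on the internals of an LSTM-style model.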



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

    cs.CL · 2023-05 · conditional · novelty 8.0

    Tiny language models under 10M parameters trained on a synthetic children's story dataset generate fluent, consistent, multi-paragraph English text with near-perfect grammar and reasoning.

  2. Visual Interaction with Deep Learning Models through Collaborative Semantic Inference

    cs.HC · 2019-07 · unverdicted · novelty 5.0

    Proposes the CSI framework for co-designing visual interactions and deep learning models to expose and allow semantic control over intermediate reasoning processes, shown in a summarization case study.

  3. Automatically Learning Construction Injury Precursors from Text

    cs.CL · 2019-07 · unverdicted · novelty 4.0

    Standard NLP classifiers can surface valid injury precursors from raw construction safety reports.