Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

Christopher D. Manning; Kai Sheng Tai; Richard Socher

arxiv: 1503.00075 · v3 · pith:ZKNWXLARnew · submitted 2015-02-28 · 💻 cs.CL · cs.AI · cs.LG

Kai Sheng Tai, Richard Socher, Christopher D. Manning

classification 💻 cs.CL cs.AI cs.LG

keywords lstm long memory network networks semantic sentiment sequence

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words to phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
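The abstract does not spell out the Tree-LSTM equations, so the following is only a minimal sketch of one way the generalization can be written, assuming the child-sum formulation: each node sums its children's hidden states for the input, output, and update gates, and applies a separate forget gate to each child's memory cell. The names `Node`, `ChildSumTreeLSTM`, and the helper functions are illustrative, the weights are random and untrained, and no claim is made that this matches the authors' implementation.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Tuple

Vector = List[float]


def matvec(m: List[Vector], v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(*vs: Vector) -> Vector:
    return [sum(xs) for xs in zip(*vs)]


def mul(a: Vector, b: Vector) -> Vector:
    return [x * y for x, y in zip(a, b)]


def sigmoid(v: Vector) -> Vector:
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh(v: Vector) -> Vector:
    return [math.tanh(x) for x in v]


@dataclass
class Node:
    """Hypothetical tree node: an input vector plus any number of children."""
    x: Vector
    children: List["Node"] = field(default_factory=list)


class ChildSumTreeLSTM:
    """Sketch of a child-sum Tree-LSTM cell with random, untrained weights."""

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0):
        rng = random.Random(seed)

        def mat(rows: int, cols: int) -> List[Vector]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # One parameter set per gate: input (i), forget (f), output (o), update (u).
        self.W = {g: mat(hidden_dim, input_dim) for g in "ifou"}
        self.U = {g: mat(hidden_dim, hidden_dim) for g in "ifou"}
        self.b = {g: [0.0] * hidden_dim for g in "ifou"}

    def _gate(self, g: str, x: Vector, h: Vector, act) -> Vector:
        return act(add(matvec(self.W[g], x), matvec(self.U[g], h), self.b[g]))

    def forward(self, node: Node) -> Tuple[Vector, Vector]:
        # Process children first (bottom-up), collecting their (h, c) states.
        states = [self.forward(child) for child in node.children]
        zero = [0.0] * self.hidden_dim
        h_sum = add(zero, *[h for h, _ in states])  # sum of child hidden states

        i = self._gate("i", node.x, h_sum, sigmoid)
        o = self._gate("o", node.x, h_sum, sigmoid)
        u = self._gate("u", node.x, h_sum, tanh)

        # Memory cell: gated update plus each child's cell, scaled by a
        # forget gate computed from that child's own hidden state.
        c = mul(i, u)
        for h_k, c_k in states:
            f_k = self._gate("f", node.x, h_k, sigmoid)
            c = add(c, mul(f_k, c_k))

        h = mul(o, tanh(c))
        return h, c


# Toy tree: a root with two leaf children, using 2-dimensional inputs.
leaf_a = Node([1.0, 0.0])
leaf_b = Node([0.0, 1.0])
root = Node([0.5, 0.5], [leaf_a, leaf_b])

model = ChildSumTreeLSTM(input_dim=2, hidden_dim=3)
h, c = model.forward(root)
```

When every node has exactly one child, the recursion above reduces to an ordinary LSTM step over a linear chain, which is the sense in which the tree version generalizes the sequential one. Because the children's hidden states are summed, this variant does not depend on child order.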

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On the Effectiveness of Code Representation in Deep Learning-Based Automated Patch Correctness Assessment
cs.SE 2026-03 unverdicted novelty 7.0

Graph-based code representations such as Code Property Graphs achieve the highest accuracy (average 82.6%) in predicting patch correctness across 15 benchmarks and outperform sequence and tree representations when use...

A Neural-based Program Decompiler
cs.PL 2019-06 unverdicted novelty 7.0

Coda is an end-to-end neural decompiler that recovers source code from binaries at 82% accuracy on unseen samples where conventional tools achieve 0%.

CodeBERT: A Pre-Trained Model for Programming and Natural Languages
cs.CL 2020-02 unverdicted novelty 6.0

CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.

Parallel Recursive LSTM
cs.LG 2026-05 unverdicted novelty 5.0

PR-LSTM replaces linear recurrence with recursive gated merging over a balanced binary tree to achieve log-depth parallelism without restricting transitions to linear or associative forms.

A Scalable Framework for Multilevel Streaming Data Analytics using Deep Learning
eess.SY 2019-07 unverdicted novelty 2.0

Describes a multilevel streaming text analytics framework combining Spark streaming, LSTM models, and SQL processing for real-time sentiment analysis demonstrated on a business use case.