Structured Prediction Energy Networks

Andrew McCallum; David Belanger

arXiv: 1511.06350 · v3 · pith:RDNDAKLSnew · submitted 2015-11-19 · cs.LG · stat.ML


classification: cs.LG, stat.ML

keywords: learning, prediction, structured, energy, labels, deep, features, multi-label


We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function of candidate labels, and then predictions are produced by using back-propagation to iteratively optimize the energy with respect to the labels. This deep architecture captures dependencies between labels that would lead to intractable graphical models, and performs structure learning by automatically learning discriminative features of the structured output. One natural application of our technique is multi-label classification, which traditionally has required strict prior assumptions about the interactions between labels to ensure tractable learning and prediction. We are able to apply SPENs to multi-label problems with substantially larger label sets than previous applications of structured prediction, while modeling high-order interactions using minimal structural assumptions. Overall, deep learning provides remarkable tools for learning features of the inputs to a prediction problem, and this work extends these techniques to learning features of structured outputs. Our experiments provide impressive performance on a variety of benchmark multi-label classification tasks, demonstrate that our technique can be used to provide interpretable structure learning, and illuminate fundamental trade-offs between feed-forward and iterative structured prediction.
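The prediction step described in the abstract, iteratively optimizing an energy with respect to the labels, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the labels are relaxed to the interval [0, 1], the energy is a per-label linear term plus a one-hidden-layer softplus term that couples labels, the gradient is derived by hand rather than by a back-propagation library, and the optimizer is plain projected gradient descent. The names `energy`, `energy_grad`, `predict`, `local_scores`, `C1`, and `c2` are hypothetical.

```python
import numpy as np


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return np.logaddexp(0.0, z)


def sigmoid(z):
    # Derivative of softplus.
    return 1.0 / (1.0 + np.exp(-z))


def energy(y, local_scores, C1, c2):
    # Assumed toy energy: a local term that scores each label independently
    # plus a global term that couples labels through hidden units.
    return float(-local_scores @ y + c2 @ softplus(C1 @ y))


def energy_grad(y, local_scores, C1, c2):
    # Hand-derived gradient of the toy energy with respect to the labels y.
    return -local_scores + C1.T @ (c2 * sigmoid(C1 @ y))


def predict(local_scores, C1, c2, steps=100, lr=0.1):
    # Start from the uninformative point y = 0.5 and take gradient steps,
    # clipping back into [0, 1] after each step (projected gradient descent).
    y = np.full(local_scores.shape, 0.5)
    for _ in range(steps):
        y = np.clip(y - lr * energy_grad(y, local_scores, C1, c2), 0.0, 1.0)
    return y


if __name__ == "__main__":
    # Three labels; the single hidden unit penalizes labels 0 and 1 co-occurring.
    local_scores = np.array([2.0, -2.0, 0.5])
    C1 = np.array([[1.0, 1.0, 0.0]])
    c2 = np.array([1.0])
    y_hat = predict(local_scores, C1, c2)
    print(y_hat, energy(y_hat, local_scores, C1, c2))
```

In this toy setting the iterate moves to the corner of the unit cube where labels 0 and 2 are on and label 1 is off, and the energy there is lower than at the starting point. In a trained model the quantities standing in for `local_scores`, `C1`, and `c2` would presumably be produced by a deep network conditioned on the input, which this sketch does not attempt to show.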


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction
cs.CV 2026-05 unverdicted novelty 6.0

SAVER proposes a conformal groundability gate plus submodular image selector that activates vision only when needed for multimodal named entity recognition and relation extraction, improving F1 while lowering compute.