Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov · 2014

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MIPaaL: Mixed Integer Program as a Layer

cs.LG · 2019-07-12 · unverdicted · novelty 8.0

MIPaaL differentiates through mixed integer programs via cutting planes to enable decision-focused learning for general MIPs, outperforming two-stage prediction-plus-optimization and LP-relaxation baselines on real-world domains.

Augmenting Self-attention with Persistent Memory

cs.LG · 2019-07-02 · unverdicted · novelty 7.0

Augmenting self-attention with persistent memory vectors allows removal of feed-forward layers from Transformers without degrading performance on character- and word-level language modeling benchmarks.

Generalizing from a few environments in safety-critical reinforcement learning

cs.LG · 2019-07-02 · unverdicted · novelty 6.0

RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.

Confidence Calibration for Convolutional Neural Networks Using Structured Dropout

cs.LG · 2019-06-23 · unverdicted · novelty 5.0

Structured dropout improves confidence calibration in CNNs by promoting ensemble diversity, with empirical support on SVHN, CIFAR-10, CIFAR-100 and in Bayesian active learning.

citing papers explorer

Showing 4 of 4 citing papers.

MIPaaL: Mixed Integer Program as a Layer · cs.LG · 2019-07-12 · unverdicted · none · ref 42
Augmenting Self-attention with Persistent Memory · cs.LG · 2019-07-02 · unverdicted · none · ref 37
Generalizing from a few environments in safety-critical reinforcement learning · cs.LG · 2019-07-02 · unverdicted · none · ref 33
Confidence Calibration for Convolutional Neural Networks Using Structured Dropout · cs.LG · 2019-06-23 · unverdicted · none · ref 36
