The Journal of Machine Learning Research

Dropout: a simple way to prevent neural networks from overfitting · 2014

11 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Progress measures for grokking via mechanistic interpretability

cs.LG · 2023-01-12 · accept · novelty 8.0

Grokking arises from gradual amplification of a Fourier-based circuit in the weights followed by removal of memorizing components.

Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.

Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surface Reconstruction

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

AmbiSuR adds intrinsic photometric disambiguation and a self-indication module to Gaussian Splatting to resolve ambiguities and improve surface reconstruction accuracy.

iGENE: A Differentiable Flux-Tube Gyrokinetic Code in TensorFlow

physics.plasm-ph · 2026-05-04 · unverdicted · novelty 7.0

A fully differentiable TensorFlow gyrokinetic code allows approximate gradients of nonlinear turbulence quantities to be used for outer-loop tasks such as profile prediction despite stochasticity.

HORST: Composing Optimizer Geometries for Sparse Transformer Training

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

HORST uses non-commutative operator composition and a hyperbolic mirror map to combine stability from adaptive optimizers with L1 sparsity bias, outperforming AdamW across sparsity levels on vision and language tasks.

GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

GCE-MIL is a backbone-agnostic wrapper that directly optimizes MIL evidence for sufficiency, necessity, and recoverability, yielding modest gains in Macro-F1 and C-index plus more faithful patch selection across many backbones and datasets.

Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

SP-KV trains a utility predictor jointly with the LLM to dynamically prune low-utility KV cache entries, achieving 3-10x memory reduction during generation with negligible performance loss.

PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting

cs.AI · 2026-05-09 · unverdicted · novelty 6.0 · 2 refs

PnP-Corrector decouples physics simulation from error correction via a plug-and-play agent, cutting error by 29% in 300-day global ocean-atmosphere forecasts.

A Cubing Strategy for Identifying Stable Hyperparameter Regions for Uncertainty Quantification in Spatial Deep Learning

stat.CO · 2026-05-15 · unverdicted · novelty 5.0

A recursive cubing framework identifies stable hyperparameter regions for MC dropout uncertainty quantification in spatial deep learning and produces competitive or superior predictive intervals versus a statistical baseline on simulations and land-surface temperature data.

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.

Adaptive Norm-Based Regularization for Neural Networks

stat.ML · 2026-04-30 · unverdicted · novelty 5.0

Covariance-aware ridge and combined l1-l2 regularizers for neural networks yield better predictive performance and complexity control than standard penalties in simulations and applications to cooling-load prediction and leukemia classification.

citing papers explorer

Showing 11 of 11 citing papers.

Progress measures for grokking via mechanistic interpretability · cs.LG · 2023-01-12 · accept · none · ref 25
Grokking arises from gradual amplification of a Fourier-based circuit in the weights followed by removal of memorizing components.
Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation · cs.LG · 2026-05-18 · unverdicted · none · ref 87
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surface Reconstruction · cs.CV · 2026-05-12 · unverdicted · none · ref 94
AmbiSuR adds intrinsic photometric disambiguation and a self-indication module to Gaussian Splatting to resolve ambiguities and improve surface reconstruction accuracy.
iGENE: A Differentiable Flux-Tube Gyrokinetic Code in TensorFlow · physics.plasm-ph · 2026-05-04 · unverdicted · none · ref 60
A fully differentiable TensorFlow gyrokinetic code allows approximate gradients of nonlinear turbulence quantities to be used for outer-loop tasks such as profile prediction despite stochasticity.
HORST: Composing Optimizer Geometries for Sparse Transformer Training · cs.LG · 2026-05-20 · unverdicted · none · ref 35
HORST uses non-commutative operator composition and a hyperbolic mirror map to combine stability from adaptive optimizers with L1 sparsity bias, outperforming AdamW across sparsity levels on vision and language tasks.
GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging · cs.CV · 2026-05-17 · unverdicted · none · ref 16
GCE-MIL is a backbone-agnostic wrapper that directly optimizes MIL evidence for sufficiency, necessity, and recoverability, yielding modest gains in Macro-F1 and C-index plus more faithful patch selection across many backbones and datasets.
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility · cs.LG · 2026-05-13 · unverdicted · none · ref 5
SP-KV trains a utility predictor jointly with the LLM to dynamically prune low-utility KV cache entries, achieving 3-10x memory reduction during generation with negligible performance loss.
PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting · cs.AI · 2026-05-09 · unverdicted · none · ref 87 · 2 links
PnP-Corrector decouples physics simulation from error correction via a plug-and-play agent, cutting error by 29% in 300-day global ocean-atmosphere forecasts.
A Cubing Strategy for Identifying Stable Hyperparameter Regions for Uncertainty Quantification in Spatial Deep Learning · stat.CO · 2026-05-15 · unverdicted · none · ref 89
A recursive cubing framework identifies stable hyperparameter regions for MC dropout uncertainty quantification in spatial deep learning and produces competitive or superior predictive intervals versus a statistical baseline on simulations and land-surface temperature data.
Margin-Adaptive Confidence Ranking for Reliable LLM Judgement · cs.LG · 2026-05-14 · unverdicted · none · ref 176
Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.
Adaptive Norm-Based Regularization for Neural Networks · stat.ML · 2026-04-30 · unverdicted · none · ref 40
Covariance-aware ridge and combined l1-l2 regularizers for neural networks yield better predictive performance and complexity control than standard penalties in simulations and applications to cooling-load prediction and leukemia classification.
