super hub Mixed citations

Gradient-based learning applied to document recognition

L. Bottou, P. Haffner, Y. Bengio, Y. LeCun · 1998 · Proceedings of the IEEE · DOI 10.1109/5.726791

Mixed citation behavior. The most common role is background (43% of classified citations).

56 Pith papers citing it

44.7k external citations · Crossref

citation-role summary

background 7 · dataset 4 · method 3

citation-polarity summary

background 6 · use dataset 4 · use method 3 · support 1

authors

L. Bottou · P. Haffner · Y. Bengio · Y. LeCun

representative citing papers

STRABLE: Benchmarking Tabular Machine Learning with Strings

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.

Adaptive multi-line fitting for stable line-core intensity and Doppler velocity

astro-ph.SR · 2026-05-20 · conditional · novelty 7.0

LineFit delivers more stable line-core intensity and Doppler velocity time series from complex multi-line solar spectra by combining adaptive windowing, asymmetric Voigt options, and split-core handling, outperforming standard fast estimators on synthetic benchmarks.

Stress-Testing Neural Network Verifiers with Provably Robust Instances

cs.LG · 2026-05-16 · conditional · novelty 7.0

A reusable framework generates verification instances with provably known robustness labels, revealing numeric tolerance issues and bugs in five verifiers while introducing difficulty profiles to diagnose failure modes.

Quantitative Linear Logic for Neuro-Symbolic Learning and Verification

cs.LO · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

QLL is a novel logic for neuro-symbolic learning that uses ML-native operations (sum, log-sum-exp) on logits to embed constraints, satisfying most linear logic properties and showing stronger correlation between empirical robustness and formal verification than prior approaches.

Estimating Implicit Regularization in Deep Learning

stat.ML · 2026-05-06 · unverdicted · novelty 7.0

Gradient matching empirically recovers implicit regularization effects such as l2 penalties from early stopping and dropout in neural networks.

On the Architectural Complexity of Neural Networks

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

A framework quantifies DNN complexity via tensor operations, links 40 years of breakthroughs to complexity increases, and releases a dataset of 3000+ unexplored high-complexity architectures.

Beyond ECE: Calibrated Size Ratio, Risk Assessment, and Confidence-Weighted Metrics

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

Introduces Calibrated Size Ratio (CSR) and confidence-weighted metrics to better detect overconfidence risk and calibration issues beyond the limitations of ECE.

BRIDGE and TCH-Net: Heterogeneous Benchmark and Multi-Branch Baseline for Cross-Domain IoT Botnet Detection

cs.CR · 2026-04-13 · unverdicted · novelty 7.0

BRIDGE creates the first formal heterogeneous multi-dataset benchmark for IoT botnet detection with LODO evaluation, and TCH-Net achieves mean LODO F1 of 0.5577 while reaching F1 0.8296 on standard tests, outperforming twelve baselines.

Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging

cs.CV · 2026-04-06 · unverdicted · novelty 7.0

SecurePix uses FeFET multidomain polarization states for in-pixel symmetric-key encryption, dropping ResNet-18 accuracy to 9.58% on MNIST and 6.98% on CIFAR-10 while supporting key-based decryption via lookup table.

Dynamic Free-Rider Detection in Federated Learning via Simulated Attack Patterns

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

S2-WEF detects dynamic free-riders in federated learning by simulating attack WEF patterns from prior global models, combining them with mutual deviation scores, and using two-dimensional clustering without proxy data or pre-training.

Multi-Mode Quantum Annealing for Generative Representation Learning with Boltzmann Priors

quant-ph · 2026-04-01 · unverdicted · novelty 7.0

A multi-mode quantum annealing approach enables VAEs with Boltzmann priors, showing faster training and better generation than Gaussian-prior VAEs on MNIST, Fashion-MNIST, and CelebA plus improved out-of-distribution detection.

Selectivity and Shape in the Design of Forward-Forward Goodness Functions

cs.LG · 2026-03-28 · unverdicted · novelty 7.0

Shape- and peak-sensitive goodness functions for Forward-Forward deliver up to 72pp gains over sum-of-squares, reaching 98.2% on MNIST and 89% on Fashion-MNIST.

Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy

cs.LG · 2026-03-06 · unverdicted · novelty 7.0

Langevin sampling on the modern Hopfield energy produces training-free stochastic attention that transitions from exact retrieval to generation as temperature rises, with an entropy inflection condition marking the shift.

Programmable superconducting neuron with intrinsic in-memory computation and dual-timescale plasticity for ultra-efficient neuromorphic computing

cs.ET · 2026-03-05 · unverdicted · novelty 7.0

A programmable superconducting LIF neuron with intrinsic static memory and dual-timescale plasticity achieves 45 GHz operation and femtojoule energy per spike.

Task complexity shapes internal representations and robustness in neural networks

cs.LG · 2025-08-07 · unverdicted · novelty 7.0

Harder classification tasks produce neural representations whose accuracy collapses under binarization and shuffling while easier tasks remain robust, defining task complexity via the performance gap between full-precision and perturbed networks.

Encrypted Neural Networks without Overflows

cs.CR · 2026-05-21 · unverdicted · novelty 6.0

Introduces formal verification to compute certified neuron range bounds for CKKS-encrypted neural networks, eliminating overflow failures that previously reached 47%.

Expectation Consistency Loss: Rethink Confidence Calibration under Covariate Shift

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

Derives expectation consistency condition as necessary and sufficient for calibration under covariate shift and proposes ECL loss with matching sample complexity to ECE.

Generative Recursive Reasoning

cs.AI · 2026-05-19 · unverdicted · novelty 6.0 · 2 refs

GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.

The Diffusion Encoder

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

A diffusion model serves as the encoder in an autoencoder when trained alternately with the decoder to resolve opposing update directions while retaining the standard diffusion training objective.

From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

LRP on EEG transformers reveals Clever Hans artifacts in motor imagery tasks and a recurring central electrode cluster as a candidate sensorimotor signature of arousal.

Instructions Shape Production of Language, not Processing

cs.CL · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

Instructions trigger a production-centered mechanism in language models, with task-specific information stable in input tokens but varying strongly in output tokens and correlating with behavior.

Inducing Spatial Locality in Vision Transformers through the Training Protocol

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

CutMix augmentation during training induces spatial locality in early layers of Vision Transformers trained from scratch, as measured by reduced Mean Attention Distance.

What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.

Flow Matching with Arbitrary Auxiliary Paths

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

AuxPath-FM extends flow matching to arbitrary auxiliary distributions while preserving the continuity equation and marginal training objective.

citing papers explorer

Showing 50 of 56 citing papers.

STRABLE: Benchmarking Tabular Machine Learning with Strings cs.LG · 2026-05-12 · unverdicted · none · ref 38
A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
Adaptive multi-line fitting for stable line-core intensity and Doppler velocity astro-ph.SR · 2026-05-20 · conditional · none · ref 39
LineFit delivers more stable line-core intensity and Doppler velocity time series from complex multi-line solar spectra by combining adaptive windowing, asymmetric Voigt options, and split-core handling, outperforming standard fast estimators on synthetic benchmarks.
Stress-Testing Neural Network Verifiers with Provably Robust Instances cs.LG · 2026-05-16 · conditional · none · ref 15
A reusable framework generates verification instances with provably known robustness labels, revealing numeric tolerance issues and bugs in five verifiers while introducing difficulty profiles to diagnose failure modes.
Quantitative Linear Logic for Neuro-Symbolic Learning and Verification cs.LO · 2026-05-13 · unverdicted · none · ref 80 · 2 links
QLL is a novel logic for neuro-symbolic learning that uses ML-native operations (sum, log-sum-exp) on logits to embed constraints, satisfying most linear logic properties and showing stronger correlation between empirical robustness and formal verification than prior approaches.
Estimating Implicit Regularization in Deep Learning stat.ML · 2026-05-06 · unverdicted · none · ref 21
Gradient matching empirically recovers implicit regularization effects such as l2 penalties from early stopping and dropout in neural networks.
On the Architectural Complexity of Neural Networks cs.LG · 2026-05-05 · unverdicted · none · ref 26
A framework quantifies DNN complexity via tensor operations, links 40 years of breakthroughs to complexity increases, and releases a dataset of 3000+ unexplored high-complexity architectures.
Beyond ECE: Calibrated Size Ratio, Risk Assessment, and Confidence-Weighted Metrics cs.LG · 2026-05-03 · unverdicted · none · ref 20
Introduces Calibrated Size Ratio (CSR) and confidence-weighted metrics to better detect overconfidence risk and calibration issues beyond the limitations of ECE.
BRIDGE and TCH-Net: Heterogeneous Benchmark and Multi-Branch Baseline for Cross-Domain IoT Botnet Detection cs.CR · 2026-04-13 · unverdicted · none · ref 13
BRIDGE creates the first formal heterogeneous multi-dataset benchmark for IoT botnet detection with LODO evaluation, and TCH-Net achieves mean LODO F1 of 0.5577 while reaching F1 0.8296 on standard tests, outperforming twelve baselines.
Lightweight True In-Pixel Encryption with FeFET Enabled Pixel Design for Secure Imaging cs.CV · 2026-04-06 · unverdicted · none · ref 32
SecurePix uses FeFET multidomain polarization states for in-pixel symmetric-key encryption, dropping ResNet-18 accuracy to 9.58% on MNIST and 6.98% on CIFAR-10 while supporting key-based decryption via lookup table.
Dynamic Free-Rider Detection in Federated Learning via Simulated Attack Patterns cs.LG · 2026-04-06 · unverdicted · none · ref 13
S2-WEF detects dynamic free-riders in federated learning by simulating attack WEF patterns from prior global models, combining them with mutual deviation scores, and using two-dimensional clustering without proxy data or pre-training.
Multi-Mode Quantum Annealing for Generative Representation Learning with Boltzmann Priors quant-ph · 2026-04-01 · unverdicted · none · ref 25
A multi-mode quantum annealing approach enables VAEs with Boltzmann priors, showing faster training and better generation than Gaussian-prior VAEs on MNIST, Fashion-MNIST, and CelebA plus improved out-of-distribution detection.
Selectivity and Shape in the Design of Forward-Forward Goodness Functions cs.LG · 2026-03-28 · unverdicted · none · ref 10
Shape- and peak-sensitive goodness functions for Forward-Forward deliver up to 72pp gains over sum-of-squares, reaching 98.2% on MNIST and 89% on Fashion-MNIST.
Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy cs.LG · 2026-03-06 · unverdicted · none · ref 27
Langevin sampling on the modern Hopfield energy produces training-free stochastic attention that transitions from exact retrieval to generation as temperature rises, with an entropy inflection condition marking the shift.
Programmable superconducting neuron with intrinsic in-memory computation and dual-timescale plasticity for ultra-efficient neuromorphic computing cs.ET · 2026-03-05 · unverdicted · none · ref 35
A programmable superconducting LIF neuron with intrinsic static memory and dual-timescale plasticity achieves 45 GHz operation and femtojoule energy per spike.
Task complexity shapes internal representations and robustness in neural networks cs.LG · 2025-08-07 · unverdicted · none · ref 34
Harder classification tasks produce neural representations whose accuracy collapses under binarization and shuffling while easier tasks remain robust, defining task complexity via the performance gap between full-precision and perturbed networks.
Encrypted Neural Networks without Overflows cs.CR · 2026-05-21 · unverdicted · none · ref 77
Introduces formal verification to compute certified neuron range bounds for CKKS-encrypted neural networks, eliminating overflow failures that previously reached 47%.
Expectation Consistency Loss: Rethink Confidence Calibration under Covariate Shift cs.LG · 2026-05-20 · unverdicted · none · ref 7
Derives expectation consistency condition as necessary and sufficient for calibration under covariate shift and proposes ECL loss with matching sample complexity to ECE.
Generative Recursive Reasoning cs.AI · 2026-05-19 · unverdicted · none · ref 15 · 2 links
GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.
The Diffusion Encoder cs.LG · 2026-05-13 · unverdicted · none · ref 32
A diffusion model serves as the encoder in an autoencoder when trained alternately with the decoder to resolve opposing update directions while retaining the standard diffusion training objective.
From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP cs.AI · 2026-05-12 · unverdicted · none · ref 9
LRP on EEG transformers reveals Clever Hans artifacts in motor imagery tasks and a recurring central electrode cluster as a candidate sensorimotor signature of arousal.
Instructions Shape Production of Language, not Processing cs.CL · 2026-05-11 · unverdicted · none · ref 175 · 2 links
Instructions trigger a production-centered mechanism in language models, with task-specific information stable in input tokens but varying strongly in output tokens and correlating with behavior.
Inducing Spatial Locality in Vision Transformers through the Training Protocol cs.CV · 2026-05-11 · unverdicted · none · ref 7
CutMix augmentation during training induces spatial locality in early layers of Vision Transformers trained from scratch, as measured by reduced Mean Attention Distance.
What If We Let Forecasting Forget? A Sparse Bottleneck for Cross-Variable Dependencies cs.LG · 2026-05-08 · unverdicted · none · ref 89
MS-FLOW uses a capacity-limited sparse routing mechanism to model only critical inter-variable dependencies in time series data, achieving state-of-the-art accuracy on 12 benchmarks with fewer but more reliable connections.
Flow Matching with Arbitrary Auxiliary Paths cs.LG · 2026-05-07 · unverdicted · none · ref 27
AuxPath-FM extends flow matching to arbitrary auxiliary distributions while preserving the continuity equation and marginal training objective.
P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference cs.AI · 2026-05-07 · unverdicted · none · ref 21
P-Guide achieves single-pass classifier-free guidance in flow matching by modulating the initial latent state and is equivalent to standard CFG under a first-order approximation while cutting latency by half.
When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge cs.DL · 2026-05-07 · unverdicted · none · ref 31 · 3 links
AI use in science has grown exponentially since 2015 but stays confined to computer science and statistics topics, shows higher retraction rates and citations, and follows distinct global adoption patterns.
Calculating Domain of Attraction Boundary of Power Systems Based on the Gentlest Ascent Dynamics math.DS · 2026-05-05 · unverdicted · none · ref 48
Applies gentlest ascent dynamics and stable manifold methods to compute domain of attraction boundaries for stable equilibria in synchronous-generator power system models.
Class Angular Distortion Index for Dimensionality Reduction cs.LG · 2026-05-01 · unverdicted · none · ref 29
CADI quantifies the preservation of relative cluster angles in low-dimensional projections using internal angles from point triples.
Empirical Insights of Test Selection Metrics under Multiple Testing Objectives and Distribution Shifts cs.SE · 2026-04-25 · unverdicted · none · ref 42
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
Modulation Feature Enhancement with a Multi-Stage Attention Network for Underwater Acoustic Target Recognition eess.SP · 2026-04-24 · unverdicted · none · ref 4 · 2 links
A 1-D CNN with novel multi-stage spectral attention mechanisms and adjustable class-balanced focal loss improves recognition accuracy on real ship-radiated noise datasets.
LTBs-KAN: Linear-Time B-splines Kolmogorov-Arnold Networks cs.LG · 2026-04-23 · unverdicted · none · ref 8
LTBs-KAN delivers linear-time B-spline evaluation in KANs plus parameter reduction via product-of-sums factorization, with competitive results on MNIST, Fashion-MNIST, and CIFAR-10.
QuanForge: A Mutation Testing Framework for Quantum Neural Networks cs.SE · 2026-04-22 · unverdicted · none · ref 29
QuanForge introduces statistical mutation killing and nine post-training mutation operators for QNNs to distinguish test suites and localize vulnerable circuit regions.
Efficient Adversarial Training via Criticality-Aware Fine-Tuning cs.CV · 2026-04-14 · unverdicted · none · ref 22
CAAT selects critical parameters for adversarial robustness in ViTs and applies PEFT to tune only those, yielding a 4.3% robustness drop versus full AT while using ~6% of parameters.
Daily Predictions of F10.7 and F30 Solar Indices with Deep Learning astro-ph.SR · 2026-04-11 · unverdicted · none · ref 16
SINet outperforms five prior statistical and deep learning methods on F10.7 predictions and provides the first deep learning forecasts for the F30 solar index.
Extraction of linearized models from pre-trained networks via knowledge distillation cs.LG · 2026-04-08 · unverdicted · none · ref 33
Koopman theory plus knowledge distillation yields linearized models from pre-trained nets that outperform standard least-squares Koopman approximations on MNIST and Fashion-MNIST in accuracy and stability.
Drifting Fields are not Conservative cs.LG · 2026-04-07 · unverdicted · none · ref 6 · 2 links
Drift fields are not conservative except for Gaussian kernels; sharp normalization makes them conservative for any radial kernel by equating them to score differences of kernel density estimates.
ML-based approach to classification and generation of structured light propagation in turbulent media physics.optics · 2026-04-04 · unverdicted · none · ref 25
ML models classify and generate structured light in turbulence using CNNs and diffusion models enhanced by Bregman distance minimization.
Deep Image Clustering Based on Curriculum Learning and Density Information cs.CV · 2026-03-31 · unverdicted · none · ref 32
IDCL adds density-based curriculum learning and density-core guidance to deep image clustering, claiming superior robustness, faster convergence, and flexibility on benchmark datasets.
Realistic Handwritten Multi-Digit Writer (MDW) Number Recognition Challenges cs.CV · 2025-11-30 · unverdicted · none · ref 9
New MDW benchmarks demonstrate that isolated digit classifiers struggle with multi-digit numbers from the same writer, necessitating task-specific metrics and advanced methods.
Pulse Shape Discrimination Algorithms: Survey and Benchmark cs.LG · 2025-08-03 · conditional · none · ref 68
A survey and benchmark of ~60 PSD algorithms on two radiation datasets finds deep learning models (MLPs and hybrids) often outperform traditional statistical methods, with an open-source Python/MATLAB toolbox and datasets released.
Distributed Normal Map-based Stochastic Proximal Gradient Methods over Networks math.OC · 2024-12-17 · unverdicted · none · ref 25
norM-DSGT and norM-ED achieve centralized stochastic proximal-gradient rates for distributed composite objectives, with norM-ED transient time O(n^3/(1-λ)^2).
Representation Gap: Explaining the Unreasonable Effectiveness of Neural Networks from a Geometric Perspective cs.LG · 2026-05-20 · unverdicted · none · ref 10
Derives an asymptotic equivalent for the Representation Gap in equivariant diffusion models, showing it depends primarily on the intrinsic dimension of the task.
Unveiling Hidden Lyman Alpha Emitters in the DESI DR1 Data astro-ph.GA · 2026-05-12 · unverdicted · none · ref 42
A CNN detects 19,685 LAEs at z=2-3.5 in DESI DR1 spectra with 95% purity and completeness.
Automated Classification of Plasma Regions at Mars Using Machine Learning physics.space-ph · 2026-04-18 · unverdicted · none · ref 16
A convolutional neural network trained on MAVEN SWIA ion spectra reliably classifies solar wind, magnetosheath, and induced magnetosphere regions at Mars, outperforming a multilayer perceptron.
ASTRAFier: A Novel and Scalable Transformer-based Stellar Variability Classifier astro-ph.IM · 2026-04-08 · unverdicted · none · ref 59
ASTRAFier is a Transformer-BiLSTM-CNN model that classifies stellar variability from light curves, reporting 94.26% accuracy on Kepler data and 88.22% on TESS, then applied to 2.8 million TESS curves to release a catalog.
DistributedEstimator: Distributed Training of Quantum Neural Networks via Circuit Cutting cs.DC · 2026-02-18 · conditional · none · ref 46
DistributedEstimator demonstrates that circuit cutting preserves test accuracy and robustness in QNN training on Iris and MNIST while revealing that classical reconstruction dominates runtime and exponential subcircuit growth limits scaling.
Agglomerative Attention cs.LG · 2019-07-15 · unverdicted · none · ref 9
Presents agglomerative attention, a linear-complexity attention model that achieves comparable performance to full attention on language modeling tasks.
Joint sparse coding and temporal dynamics support context reconfiguration q-bio.NC · 2026-05-11 · unverdicted · none · ref 13
Joint sparse coding and temporal dynamics in mPFC and computational networks reduce cross-context interference and enhance separability, enabling better retention in lifelong learning without extra heuristics.
Multi-Dataset Cross-Domain Knowledge Distillation for Unified Medical Image Segmentation, Classification, and Detection cs.CV · 2026-05-02 · unverdicted · none · ref 46
A multi-dataset cross-domain knowledge distillation approach improves unified performance on medical image segmentation, classification, and detection by transferring domain-invariant features from a joint teacher model to task-specific students.
Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors stat.ML · 2026-04-23 · unverdicted · none · ref 35
Augmenting zone-level MTPL claim frequency models with coordinates, environmental features at 5 km scale, and image embeddings improves predictive accuracy on unseen postcodes across GLM, regularized GLM, and tree-based models.
