Canonical reference

Deep learning

Geoffrey Hinton, Yann LeCun, Yoshua Bengio · 2015 · Nature · DOI 10.1038/nature14539 · arXiv gov/2601744

Canonical reference. 100% of citing Pith papers cite this work as background.

52 Pith papers citing it

71.9k external citations · Crossref

Background 100% of classified citations


citation-role summary

background 13 · method 1

citation-polarity summary

background 14

authors

Geoffrey Hinton, Yann LeCun, Yoshua Bengio

representative citing papers

Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAML meta-learns a progressively refined inductive bias from active-learning queries to improve robustness to spurious correlations, reporting accuracy gains on minority groups across several benchmarks.

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels

cs.LG · 2026-05-19 · unverdicted · novelty 7.0

Symmetrization of multi-class losses produces a unique convex symmetric loss that locally approximates others and supports robust neural training under label noise.

Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

In a combinatorial toy setting, winning lottery tickets preserve families of compatible feature locations in early feature space that balance proximity to final codes with low interference, rather than specific weight subnetworks.

DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

DualTCN is the first deep-learning model for time-domain marine CSEM inversion that regresses four earth parameters, achieves high accuracy on simulated data, and runs up to 21,000 times faster than classical optimizers.

Broximal Alignment for Global Non-Convex Optimization

math.OC · 2026-04-15 · unverdicted · novelty 7.0

Broximal Alignment is a novel condition under which the Ball Proximal Point Method converges to global minima in non-convex settings, generalizing quasiconvexity, star convexity, and related frameworks.

On the Decompositionality of Neural Networks

cs.LO · 2026-04-09 · unverdicted · novelty 7.0

Neural decompositionality is defined via decision-boundary semantic preservation, and language transformers largely satisfy it under SAVED while vision models often do not.

Accelerating Inference for Multilayer Neural Networks with Quantum Computers

quant-ph · 2025-10-08 · unverdicted · novelty 7.0

Quantum circuits for coherent multilayer neural network inference achieve quadratic to polylogarithmic speedups over classical methods depending on quantum data access models for inputs and weights.

Non-markovian neural quantum propagator and its application to the simulation of ultrafast nonlinear spectra

physics.chem-ph · 2024-08-01 · unverdicted · novelty 7.0

A machine learning model called neural quantum propagator is introduced to efficiently solve non-Markovian quantum dynamics described by HEOM and applied to simulate spectra of the FMO complex.

Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis

eess.SP · 2026-05-16 · unverdicted · novelty 6.0

Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.

Extracting redshifts from 2D slitless spectroscopic images using deep learning for the CSST galaxy survey

astro-ph.IM · 2026-05-16 · unverdicted · novelty 6.0

A Bayesian CNN maps 2D slitless spectral images to redshift estimates with NMAD precision 0.0104 for SNR_GI >=1 and better for brighter sources, while remaining robust to wavelength calibration errors via spatial augmentations.

Scaling Vision Models Does Not Consistently Improve Localisation-Based Explanation Quality

cs.CV · 2026-05-11 · accept · novelty 6.0

Scaling vision models by depth and parameter count does not consistently improve localisation-based explanation quality across architectures, datasets, and post-hoc methods; smaller models often perform comparably or better.

Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping

cs.CV · 2026-05-07 · conditional · novelty 6.0

Mixing real UAV imagery with 2101 AI-generated image-mask pairs improves semantic segmentation F1 scores for fine-grained forest species by over 15 percentage points overall and up to 30 points for rare classes.

A Meta Reinforcement Learning Approach to Goals-Based Wealth Management

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.

Lottery BP: Unlocking Quantum Error Decoding at Scale

cs.AR · 2026-04-28 · unverdicted · novelty 6.0

Lottery BP adds randomness to belief propagation decoding and uses syndrome voting to achieve far higher accuracy on topological quantum codes while reducing reliance on expensive global decoders.

Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images

cs.CV · 2026-04-27 · unverdicted · novelty 6.0

TSMNet uses a dual-branch text encoder and text-guided fusion module to integrate scene-level semantic and object-level label features from text with visual embeddings, achieving superior open-vocabulary segmentation on new multimodal remote sensing datasets.

Mistake gating leads to energy and memory efficient continual learning

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

Mistake-gated plasticity reduces neural network updates by 50-80% by gating changes on classification errors, improving efficiency for continual learning without added hyperparameters.

Extraction of linearized models from pre-trained networks via knowledge distillation

cs.LG · 2026-04-08 · unverdicted · novelty 6.0

Koopman theory plus knowledge distillation yields linearized models from pre-trained nets that outperform standard least-squares Koopman approximations on MNIST and Fashion-MNIST in accuracy and stability.

Neural Networks With Dense Weights Are Not Universal Approximators

cs.LG · 2026-02-07 · unverdicted · novelty 6.0 · 2 refs

Dense ReLU networks under natural weight and dimension constraints fail to approximate certain Lipschitz functions, unlike unrestricted networks.

iPDB -- Optimizing Semantic SQL Queries

cs.DB · 2026-01-23 · unverdicted · novelty 6.0

iPDB adds a predict operator and semantic query optimizations to SQL so that LLM and ML calls run efficiently inside the database, delivering 2.5x average and up to 30x speedup over prior systems.

Fusion or Confusion? Multimodal Complexity Is Not All You Need

cs.LG · 2025-12-28 · unverdicted · novelty 6.0

Complex multimodal architectures do not reliably outperform unimodal baselines or a simple multimodal baseline under standardized evaluation.

HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability

cs.LG · 2025-12-08 · unverdicted · novelty 6.0

HOLE applies persistent homology to latent embeddings in neural networks and uses visualizations such as cluster flow diagrams to reveal patterns of class separation, feature disentanglement, and robustness.

Pulse Shape Discrimination Algorithms: Survey and Benchmark

cs.LG · 2025-08-03 · conditional · novelty 6.0

A survey and benchmark of ~60 PSD algorithms on two radiation datasets finds deep learning models (MLPs and hybrids) often outperform traditional statistical methods, with an open-source Python/MATLAB toolbox and datasets released.

Relating Simple Sentence Representations in Deep Neural Networks and the Brain

cs.CL · 2019-06-27 · unverdicted · novelty 6.0

BERT activations show strongest correlation with MEG data for simple sentences; DNN representations generate synthetic brain data that improves stimuli decoding accuracy.

A Simulation Methodology Testbed for Typhoon Sensitivity Analysis: Framework Development and Perturbation-Response Experiments with the Pangu Weather Model

physics.ao-ph · 2026-05-21 · unverdicted · novelty 5.0

A MATLAB/ONNX testbed integrates the Pangu AI model with PID closed-loop control to perform single-input single-output perturbation-response experiments on typhoon track and intensity.

citing papers explorer

Showing 50 of 52 citing papers.

Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations cs.LG · 2026-05-20 · unverdicted · none · ref 191
CAML meta-learns a progressively refined inductive bias from active-learning queries to improve robustness to spurious correlations, reporting accuracy gains on minority groups across several benchmarks.
Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels cs.LG · 2026-05-19 · unverdicted · none · ref 113
Symmetrization of multi-class losses produces a unique convex symmetric loss that locally approximates others and supports robust neural training under label noise.
Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space cs.LG · 2026-05-18 · unverdicted · none · ref 45
In a combinatorial toy setting, winning lottery tickets preserve families of compatible feature locations in early feature space that balance proximity to final codes with low interference, rather than specific weight subnetworks.
DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion cs.LG · 2026-05-06 · unverdicted · none · ref 15
DualTCN is the first deep-learning model for time-domain marine CSEM inversion that regresses four earth parameters, achieves high accuracy on simulated data, and runs up to 21,000 times faster than classical optimizers.
Broximal Alignment for Global Non-Convex Optimization math.OC · 2026-04-15 · unverdicted · none · ref 4
Broximal Alignment is a novel condition under which the Ball Proximal Point Method converges to global minima in non-convex settings, generalizing quasiconvexity, star convexity, and related frameworks.
On the Decompositionality of Neural Networks cs.LO · 2026-04-09 · unverdicted · none · ref 25
Neural decompositionality is defined via decision-boundary semantic preservation, and language transformers largely satisfy it under SAVED while vision models often do not.
Accelerating Inference for Multilayer Neural Networks with Quantum Computers quant-ph · 2025-10-08 · unverdicted · none · ref 1
Quantum circuits for coherent multilayer neural network inference achieve quadratic to polylogarithmic speedups over classical methods depending on quantum data access models for inputs and weights.
Non-markovian neural quantum propagator and its application to the simulation of ultrafast nonlinear spectra physics.chem-ph · 2024-08-01 · unverdicted · none · ref 28
A machine learning model called neural quantum propagator is introduced to efficiently solve non-Markovian quantum dynamics described by HEOM and applied to simulate spectra of the FMO complex.
Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis eess.SP · 2026-05-16 · unverdicted · none · ref 265
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
Extracting redshifts from 2D slitless spectroscopic images using deep learning for the CSST galaxy survey astro-ph.IM · 2026-05-16 · unverdicted · none · ref 26
A Bayesian CNN maps 2D slitless spectral images to redshift estimates with NMAD precision 0.0104 for SNR_GI >=1 and better for brighter sources, while remaining robust to wavelength calibration errors via spatial augmentations.
Scaling Vision Models Does Not Consistently Improve Localisation-Based Explanation Quality cs.CV · 2026-05-11 · accept · none · ref 20
Scaling vision models by depth and parameter count does not consistently improve localisation-based explanation quality across architectures, datasets, and post-hoc methods; smaller models often perform comparably or better.
Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping cs.CV · 2026-05-07 · conditional · none · ref 41
Mixing real UAV imagery with 2101 AI-generated image-mask pairs improves semantic segmentation F1 scores for fine-grained forest species by over 15 percentage points overall and up to 30 points for rare classes.
A Meta Reinforcement Learning Approach to Goals-Based Wealth Management cs.LG · 2026-05-04 · unverdicted · none · ref 223
MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.
Lottery BP: Unlocking Quantum Error Decoding at Scale cs.AR · 2026-04-28 · unverdicted · none · ref 52
Lottery BP adds randomness to belief propagation decoding and uses syndrome voting to achieve far higher accuracy on topological quantum codes while reducing reliance on expensive global decoders.
Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images cs.CV · 2026-04-27 · unverdicted · none · ref 15
TSMNet uses a dual-branch text encoder and text-guided fusion module to integrate scene-level semantic and object-level label features from text with visual embeddings, achieving superior open-vocabulary segmentation on new multimodal remote sensing datasets.
Mistake gating leads to energy and memory efficient continual learning cs.AI · 2026-04-15 · unverdicted · none · ref 15
Mistake-gated plasticity reduces neural network updates by 50-80% by gating changes on classification errors, improving efficiency for continual learning without added hyperparameters.
Extraction of linearized models from pre-trained networks via knowledge distillation cs.LG · 2026-04-08 · unverdicted · none · ref 1
Koopman theory plus knowledge distillation yields linearized models from pre-trained nets that outperform standard least-squares Koopman approximations on MNIST and Fashion-MNIST in accuracy and stability.
Neural Networks With Dense Weights Are Not Universal Approximators cs.LG · 2026-02-07 · unverdicted · none · ref 4 · 2 links
Dense ReLU networks under natural weight and dimension constraints fail to approximate certain Lipschitz functions, unlike unrestricted networks.
iPDB -- Optimizing Semantic SQL Queries cs.DB · 2026-01-23 · unverdicted · none · ref 12
iPDB adds a predict operator and semantic query optimizations to SQL so that LLM and ML calls run efficiently inside the database, delivering 2.5x average and up to 30x speedup over prior systems.
Fusion or Confusion? Multimodal Complexity Is Not All You Need cs.LG · 2025-12-28 · unverdicted · none · ref 23
Complex multimodal architectures do not reliably outperform unimodal baselines or a simple multimodal baseline under standardized evaluation.
HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability cs.LG · 2025-12-08 · unverdicted · none · ref 44
HOLE applies persistent homology to latent embeddings in neural networks and uses visualizations such as cluster flow diagrams to reveal patterns of class separation, feature disentanglement, and robustness.
Pulse Shape Discrimination Algorithms: Survey and Benchmark cs.LG · 2025-08-03 · conditional · none · ref 44
A survey and benchmark of ~60 PSD algorithms on two radiation datasets finds deep learning models (MLPs and hybrids) often outperform traditional statistical methods, with an open-source Python/MATLAB toolbox and datasets released.
Relating Simple Sentence Representations in Deep Neural Networks and the Brain cs.CL · 2019-06-27 · unverdicted · none · ref 13
BERT activations show strongest correlation with MEG data for simple sentences; DNN representations generate synthetic brain data that improves stimuli decoding accuracy.
A Simulation Methodology Testbed for Typhoon Sensitivity Analysis: Framework Development and Perturbation-Response Experiments with the Pangu Weather Model physics.ao-ph · 2026-05-21 · unverdicted · none · ref 17
A MATLAB/ONNX testbed integrates the Pangu AI model with PID closed-loop control to perform single-input single-output perturbation-response experiments on typhoon track and intensity.
Classification of Single and Mixed Partial Discharges under Switching Voltage Using an AWA-CNN Framework cs.LG · 2026-05-20 · unverdicted · none · ref 31
AWA patterns from PD pulse amplitude, width, and area enable CNNs to classify single and mixed partial discharge sources under switching voltage with over 96% test accuracy.
Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model cs.CV · 2026-05-18 · unverdicted · none · ref 33
A domain-adapted diffusion model synthesizes heterogeneous PET images from uniform organ activity maps, achieving high quantitative accuracy (CCC > 0.92) and visual realism comparable to real scans.
Foundation Models for Credit Risk Prediction: A Game Changer? cs.LG · 2026-05-18 · unverdicted · none · ref 146
Tabular foundation models outperform standard methods in credit risk PD and LGD tasks, with larger gains on smaller datasets when used out-of-the-box.
Interpretable Neural Networks to Predict Momentum Fluxes of Orographic Gravity Waves physics.ao-ph · 2026-05-06 · conditional · none · ref 43
Neural networks predict orographic gravity wave momentum fluxes from coarse state variables with offline R² of 0.56-0.72, learn physically meaningful relationships via SHAP, and are compared to the Lott-Miller parameterization.
Machine Learning Enhanced Laser Spectroscopy for Multi-Species Gas Detection in Complex and Harsh Environments physics.optics · 2026-05-02 · unverdicted · none · ref 65
Machine learning methods including denoising autoencoders, unsupervised interference mitigation, blind source separation, and certifiable classification are developed and experimentally validated to improve multi-species laser spectroscopy under complex conditions.
Predicting Associations between Solar Flares and Coronal Mass Ejections Using SDO/HMI Magnetograms and a Hybrid Neural Network astro-ph.SR · 2026-04-11 · unverdicted · none · ref 35
Hybrid neural network predicts eruptive versus confined solar flares from SDO/HMI magnetogram sequences, reports good performance, and links results to magnetic flux cancellation in polarity inversion lines.
Determination of Nanoparticle and Microdroplet Parameters in Levitating Microdroplets of Suspension by Speckle Image Analysis Using Convolutional Neural Networks physics.app-ph · 2026-04-08 · unverdicted · none · ref 11
CNNs trained on speckle images from levitating TiO2 suspension microdroplets classify droplet diameter with better than 6% accuracy and provide useful discrimination for nanoparticle concentration and diameter, including simultaneous three-parameter classification.
Operator-Theoretic Energy Functionals for Impulse-Excited Nonstationary Signal Analysis eess.SP · 2026-04-07 · unverdicted · none · ref 26
An operator-based Energy Concentration Index yields the IMRED detector that identifies defect-induced changes in impulse responses with AUC 0.908, outperforming standard Fourier and wavelet energy measures.
Vanishing Contributions: A Unified Framework for Smooth and Iterative Model Compression cs.LG · 2025-10-09 · unverdicted · none · ref 1
VCON is a unified framework for smooth iterative DNN compression that uses parallel execution and an affine combination to progressively replace the original model with its compressed form during fine-tuning.
Bayesian Reasoning for Physics Informed Neural Networks physics.comp-ph · 2023-08-25 · unverdicted · none · ref 1
Introduces Laplace-approximated Bayesian PINNs for automatic loss-weight optimization when solving PDEs such as heat, wave, and Burgers equations.
General Inverse Design of Thin-Film Metamaterials With Convolutional Neural Networks physics.comp-ph · 2021-03-29 · unverdicted · none · ref 40
Convolutional neural networks are shown to perform inverse design of thin-film metamaterial stacks by learning the mapping from structure to ellipsometric and reflectance/transmittance spectra, with efficiency gains over traditional optimization as layer count increases.
Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models cs.LG · 2026-05-19 · unverdicted · none · ref 21
A 354-parameter shallow-deep neural network using age, AST, ALT, platelets and FIB-4 achieved external ROC-AUCs of 0.77 and 0.67 for advanced MASLD fibrosis, slightly above FIB-4's 0.75 and 0.60 on Malaysian and Indian cohorts.
A Systematic Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation cs.CV · 2026-05-16 · unverdicted · none · ref 48
A systematic literature survey that categorizes deep learning architectures for point cloud classification, part segmentation, and semantic segmentation, evaluates them on benchmarks, and discusses innovations, limitations, and future directions.
Joint sparse coding and temporal dynamics support context reconfiguration q-bio.NC · 2026-05-11 · unverdicted · none · ref 8
Joint sparse coding and temporal dynamics in mPFC and computational networks reduce cross-context interference and enhance separability, enabling better retention in lifelong learning without extra heuristics.
Single-Cycle Multidirectional EOG Classification Faster than Human Reaction Time for Wearable Human-Computer Interactions eess.SP · 2026-04-27 · unverdicted · none · ref 23
Cascaded neural networks classify 10 eye-movement classes from single-cycle EOG signals at 99% accuracy with sub-83 ms latency below human reaction time.
Using Deep Learning Models Pretrained by Self-Supervised Learning for Protein Localization cs.CV · 2026-04-13 · unverdicted · none · ref 21
DINO-based ViT models pretrained on HPA FOV achieve macro F1 of 0.822 zero-shot and 0.860 after fine-tuning for protein localization on OpenCell, demonstrating effective transfer from SSL pretraining.
The ZTF-ULTRASAT experiment: Characterizing the non-transients in ULTRASAT's high cadence survey astro-ph.SR · 2026-04-08 · unverdicted · none · ref 45
ZTF high-cadence data shows RR Lyrae stars and flaring sources can mimic UV transients, with pre-existing ML catalogs offering a concrete mitigation approach.
Machine Learning-Based Cluster Classification to Suppress Background in a Prototype RPC Detector physics.ins-det · 2026-03-30 · unverdicted · none · ref 10
Machine learning classifiers using fifteen cluster-level descriptors from time and ADC distributions effectively separate signal from background hits in prototype RPC detectors.
Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI cs.CY · 2024-12-02 · unverdicted · none · ref 50
Experts rate AI scenarios as more likely, less risky, more beneficial, and more valuable than the public, applying different weightings to risk versus benefit.
Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis cs.CV · 2024-07-21 · unverdicted · none · ref 11
A framework segments panoramic video into sub-images for detection, modifies multi-object tracking for boundary continuity, and applies it to vehicle overtaking detection in real cycling videos, reporting gains in precision and an F-score of 0.82.
A Proof-of-Concept Simulation-Driven Digital Twin Framework for Decision-Aware Diabetes Modeling cs.LG · 2026-05-11 · unverdicted · none · ref 30
A simulation-driven digital twin framework is shown to generate interpretable diabetes trajectories for decision-aware analysis by combining benchmark data with controlled synthetic scenarios.
A Specialized Importance-Aware Quantum Convolutional Neural Network with Ring-Topology (IA-QCNN) for MGMT Promoter Methylation Prediction in Glioblastoma quant-ph · 2026-04-24 · unverdicted · none · ref 41
IA-QCNN applies quantum principles via ring-topology convolution and importance weighting to achieve claimed high-accuracy MGMT methylation prediction from MRI with fewer parameters and noise robustness than classical models.
Supplementary Materials to Graph Convolutional Branch and Bound cs.LG · 2024-06-05 · unverdicted · none · ref 5
Supplementary results on 1-tree relaxation performance inside a GCN-augmented branch-and-bound solver for TSP.
MiniGPT: Rebuilding GPT from First Principles cs.CL · 2026-05-17 · conditional · none · ref 32
MiniGPT is a self-contained PyTorch implementation of standard GPT autoregressive modeling that reaches 1.478 validation loss on Tiny Shakespeare with a 10.77M-parameter model and produces recognizable Shakespeare-style text.
AI-Powered Surrogate Modelling for Multiscale Combustion: A Critical Review and Opportunities physics.chem-ph · 2026-04-28 · unverdicted · none · ref 37
A critical review of AI surrogate models for multiscale combustion that compares supervised, unsupervised, and physics-guided methods, identifies transferability and consistency challenges, and outlines future opportunities.
Enhancing Laser Surface Texturing through Advanced Machine Learning Techniques cond-mat.mtrl-sci · 2026-04-14 · unverdicted · none · ref 10
Neural networks and random forests predict surface roughness from laser parameters and material data with high accuracy, speeding up optimization and reducing experimental effort.
