hub Mixed citations

He, K, … & Sun, J · 2015 · cs.CV · arXiv 1502.01852

Mixed citation behavior. Most common role is background (67%).

17 Pith papers citing it

Background 67% of classified citations

abstract

Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.
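The two contributions named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the slope `a` is held fixed here (in the paper it is a learned parameter, initialized to 0.25), and the initialization uses the zero-mean Gaussian with standard deviation sqrt(2 / ((1 + a^2) * fan_in)) from the paper's forward-propagation analysis. A small helper also checks the relative-improvement arithmetic quoted in the abstract.

```python
import math
import random


def prelu(y, a=0.25):
    """PReLU: identity for y > 0, slope `a` for y <= 0.

    a = 0 recovers plain ReLU. In the paper `a` is learned;
    here it is a fixed argument (assumption for illustration).
    """
    return y if y > 0 else a * y


def he_std(fan_in, a=0.0):
    """Std of zero-mean Gaussian weights for rectifier layers.

    fan_in is the number of inputs feeding one output unit
    (k * k * c for a k x k convolution over c channels).
    """
    return math.sqrt(2.0 / ((1.0 + a * a) * fan_in))


def he_init(fan_in, fan_out, a=0.0, seed=0):
    """Sample a fan_out x fan_in weight matrix (hypothetical helper)."""
    rng = random.Random(seed)
    std = he_std(fan_in, a)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    print(prelu(-2.0))                        # negative input scaled by a
    print(he_std(200))                        # std for a layer with 200 inputs
    print(relative_improvement(6.66, 4.94))   # GoogLeNet vs. PReLU-nets
```

With the abstract's figures, `relative_improvement(6.66, 4.94)` evaluates to roughly 0.258, which rounds to the 26% stated above.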

citation-role summary

background: 5 · method: 1

citation-polarity summary

background: 4 · support: 1 · use method: 1

representative citing papers

U-Net: Convolutional Networks for Biomedical Image Segmentation

cs.CV · 2015-05-18 · accept · novelty 8.0

A u-shaped fully-convolutional encoder-decoder with skip connections trained with elastic-deformation augmentation produces accurate biomedical image segmentations from very small training sets.

Determining star formation histories and age-metallicity relations with convolutional neural networks

astro-ph.GA · 2026-05-13 · unverdicted · novelty 7.0

A CNN with attention and shared latent space recovers SFHs and metallicities from spectro-photometric data with ~0.12 dex age and ~0.03 dex metallicity dispersion while running thousands of times faster than full spectral fitting.

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies

cs.NE · 2026-02-26 · unverdicted · novelty 7.0

Isotropic activation functions derived from reparameterisation symmetries and SVD diagonalisation enable function-preserving neuron removal and addition in dense networks, supporting up to 50% sparsification and real-time topology adaptation.

Criticality and Saturation in Orthogonal Neural Networks

cs.LG · 2026-05-07 · conditional · novelty 7.0

Derives layer-wise recursions for finite-width tensors under orthogonal initialization that reproduce the observed large-depth stability of nonlinear networks.

A Theory of Saddle Escape in Deep Nonlinear Networks

cs.LG · 2026-05-02 · conditional · novelty 7.0 · 2 refs

An exact norm-imbalance identity classifies activations into four classes and reduces deep nonlinear training flow to a scalar ODE that predicts saddle escape time scaling as ε to the power of minus (r-2) for r bottleneck layers.

Progressive Growing of GANs for Improved Quality, Stability, and Variation

cs.NE · 2017-10-27 · accept · novelty 7.0

Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.

Wide Residual Networks

cs.CV · 2016-05-23 · accept · novelty 7.0

Wide residual networks achieve higher accuracy and faster training than very deep thin residual networks by increasing width and decreasing depth, setting new state-of-the-art results on CIFAR, SVHN, and ImageNet.

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

cs.CV · 2015-06-10 · accept · novelty 7.0

LSUN dataset of one million images per category across 30 classes is constructed via iterative human-in-the-loop deep learning labeling.

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network

cs.CL · 2025-10-02 · unverdicted · novelty 6.0

Introduces FraudSquad, a hybrid model using language model embeddings and a gated graph transformer that outperforms baselines on newly created LLM-generated spam review datasets.

Safe Policy Improvement with Soft Baseline Bootstrapping

cs.LG · 2019-07-11 · unverdicted · novelty 6.0

Extends SPIBB with soft uncertainty-constrained policy search for less conservative safe policy improvement in batch RL, with optimal and approximate solvers shown empirically on finite and neural MDPs.

Similarity-Based Bike Station Expansion via Hybrid Denoising Autoencoders

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

A hybrid denoising autoencoder with supervised head learns latent urban features to select bike station expansion candidates via latent-space similarity, producing 32 consensus high-confidence zones in Trondheim.

MONAI: An open-source framework for deep learning in healthcare

cs.LG · 2022-11-04 · accept · novelty 6.0

MONAI is a community-supported PyTorch framework that extends deep learning to medical data with domain-specific architectures, transforms, and deployment tools.

Learning Minimal Representations of Many-Body Physics from Snapshots of a Quantum Simulator

quant-ph · 2025-09-17 · unverdicted · novelty 5.0

A VAE learns a minimal latent representation from noisy quantum simulator snapshots that correlates with the sine-Gordon equilibrium parameter and detects anomalous post-quench dynamics including frozen-in solitons.

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

cs.LG · 2026-04-24 · unverdicted · novelty 5.0

Toeplitz MLP Mixers replace attention with masked Toeplitz multiplications for sub-quadratic complexity while retaining more sequence information and outperforming on copying and in-context tasks.

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems

cs.LG · 2019-07-16 · unverdicted · novelty 4.0

Experiments show that shifted-ReLU layers can replace batch-normalization in single-bit-weight wide residual networks on CIFAR-10/100 and ImageNet without consistent accuracy penalty.

Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent

cs.AI · 2026-02-23 · unverdicted · novelty 2.0

A survey provides a task-based formalization of meta-learning and meta-RL while chronicling algorithms that lead to DeepMind's Adaptive Agent.

Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics

cs.LG · 2022-12-18 · unverdicted · novelty 2.0

A comprehensive review of deep learning techniques for computational mechanics, including LSTM for constitutive modeling, PINNs for PDE solving, optimizers, and kernel methods.

citing papers explorer

Showing 17 of 17 citing papers.

U-Net: Convolutional Networks for Biomedical Image Segmentation
cs.CV · 2015-05-18 · accept · none · ref 5

Determining star formation histories and age-metallicity relations with convolutional neural networks
astro-ph.GA · 2026-05-13 · unverdicted · none · ref 34 · internal anchor

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies
cs.NE · 2026-02-26 · unverdicted · none · ref 28 · internal anchor

Criticality and Saturation in Orthogonal Neural Networks
cs.LG · 2026-05-07 · conditional · none · ref 2

A Theory of Saddle Escape in Deep Nonlinear Networks
cs.LG · 2026-05-02 · conditional · none · ref 19 · 2 links

Progressive Growing of GANs for Improved Quality, Stability, and Variation
cs.NE · 2017-10-27 · accept · none · ref 18

Wide Residual Networks
cs.CV · 2016-05-23 · accept · none · ref 12

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
cs.CV · 2015-06-10 · accept · none · ref 7

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network
cs.CL · 2025-10-02 · unverdicted · none · ref 15 · internal anchor

Safe Policy Improvement with Soft Baseline Bootstrapping
cs.LG · 2019-07-11 · unverdicted · none · ref 1 · internal anchor

Similarity-Based Bike Station Expansion via Hybrid Denoising Autoencoders
cs.LG · 2026-04-17 · unverdicted · none · ref 5

MONAI: An open-source framework for deep learning in healthcare
cs.LG · 2022-11-04 · accept · none · ref 26

Learning Minimal Representations of Many-Body Physics from Snapshots of a Quantum Simulator
quant-ph · 2025-09-17 · unverdicted · none · ref 74 · internal anchor

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models
cs.LG · 2026-04-24 · unverdicted · none · ref 71

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems
cs.LG · 2019-07-16 · unverdicted · none · ref 16 · internal anchor

Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent
cs.AI · 2026-02-23 · unverdicted · none · ref 66 · internal anchor

Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics
cs.LG · 2022-12-18 · unverdicted · none · ref 63 · internal anchor
