The computational limits of deep learning

Thompson, N · 2020 · arXiv 2007.05558

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Rates of forgetting for the sequentially Markov coalescent

math.PR · 2026-04-22 · unverdicted · novelty 7.0

SMC forgets its initial condition geometrically in the jump chain and as 1/ℓ in continuous genetic distance, justifying independent-locus approximations.

From Membership-Privacy Leakage to Quantum Machine Unlearning

quant-ph · 2025-09-07 · unverdicted · novelty 7.0

Quantum neural networks exhibit membership privacy leakage that a proposed quantum machine unlearning framework with three mechanisms can mitigate in simulations and cloud device tests.

Recursive Block-Diagonal Coupling for Resource-Efficient Training of Vision Models

cs.CV · 2026-05-22 · unverdicted · novelty 6.0

RBDC trains wide vision models by recursive block-diagonal coupling of narrower pre-trained models, reducing training FLOPs by 30% at similar ImageNet accuracy for DeiT and ResNet while outperforming model growth baselines.

General-Purpose Photonic Computing Primitive for Contemporary Artificial Intelligence

physics.optics · 2026-05-21 · unverdicted · novelty 6.0

DUET is a photonic tensor core paradigm that uses structural symmetry in VODICs to support arbitrary signed operands directly, experimentally tested on image classification, segmentation, and Transformer tasks.

Mixture of Heterogeneous Grouped Experts for Language Modeling

cs.CL · 2026-04-25 · unverdicted · novelty 6.0

MoHGE achieves standard MoE performance with 20% fewer parameters and balanced GPU utilization via grouped heterogeneous experts, two-level routing, and specialized auxiliary losses.

STRIDe: Cross-Coupled STT-MRAM Enabling Robust In-Memory-Computing for Deep Neural Network Accelerators

cs.ET · 2026-04-06 · unverdicted · novelty 6.0

STRIDe cross-coupled STT-MRAM improves sense margin up to 3.86x and read disturb margin up to 27.6% for XNOR and AND IMC, achieving near-software DNN inference accuracy on CIFAR10 despite process variations.

Neural Networks With Dense Weights Are Not Universal Approximators

cs.LG · 2026-02-07 · unverdicted · novelty 6.0 · 2 refs

Dense ReLU networks under natural weight and dimension constraints fail to approximate certain Lipschitz functions, unlike unrestricted networks.

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

cs.AI · 2024-08-01 · conditional · novelty 6.0

Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.

Position: LLM Inference Should Be Evaluated as Energy-to-Token Production

cs.CE · 2026-05-12 · unverdicted · novelty 5.0

LLM inference should be reframed and evaluated as energy-to-token production with a Token Production Function that accounts for power, cooling, and efficiency ceilings.

OptiLookUp: An Optical ROM-Based Lookup Table Engine for Photonic Accelerators

physics.optics · 2026-05-05 · unverdicted · novelty 5.0 · 2 refs

OptiLookUp is a reconfigurable photonic ROM that uses integrated microring resonators in banked sub-arrays with optical decoding and transistor selectors; it is simulated on the GlobalFoundries 45SPCLO platform to operate at 12.5 GHz for activation functions.

Blockchain and AI: Securing Intelligent Networks for the Future

cs.CR · 2026-04-07 · unverdicted · novelty 5.0

Blockchain and AI integration for network security has strong conceptual fit but mostly prototype-level evidence; the paper offers a taxonomy, integration patterns, and the BASE evaluation checklist to organize the field.

A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models

math.OC · 2025-12-03 · unverdicted · novelty 5.0

The authors cast auxiliary-loss-free load balancing as a primal-dual assignment solver, prove structural properties in deterministic and online regimes, and report experiments on 1B-parameter DeepSeekMoE models.

Auto-Relational Reasoning

cs.AI · 2026-04-29 · unverdicted · novelty 3.0

A system using auto-relational reasoning solves IQ test problems at a 98.03% rate without any prior knowledge, reaching top 1% human performance.

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices

cs.DC · 2025-03-11 · unverdicted · novelty 2.0

Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.

citing papers explorer

Rates of forgetting for the sequentially Markov coalescent
math.PR · 2026-04-22 · unverdicted · none · ref 152

From Membership-Privacy Leakage to Quantum Machine Unlearning
quant-ph · 2025-09-07 · unverdicted · none · ref 32

Recursive Block-Diagonal Coupling for Resource-Efficient Training of Vision Models
cs.CV · 2026-05-22 · unverdicted · none · ref 29

General-Purpose Photonic Computing Primitive for Contemporary Artificial Intelligence
physics.optics · 2026-05-21 · unverdicted · none · ref 8

Mixture of Heterogeneous Grouped Experts for Language Modeling
cs.CL · 2026-04-25 · unverdicted · none · ref 23

STRIDe: Cross-Coupled STT-MRAM Enabling Robust In-Memory-Computing for Deep Neural Network Accelerators
cs.ET · 2026-04-06 · unverdicted · none · ref 8

Neural Networks With Dense Weights Are Not Universal Approximators
cs.LG · 2026-02-07 · unverdicted · none · ref 11 · 2 links

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models
cs.AI · 2024-08-01 · conditional · none · ref 216

Position: LLM Inference Should Be Evaluated as Energy-to-Token Production
cs.CE · 2026-05-12 · unverdicted · none · ref 72

OptiLookUp: An Optical ROM-Based Lookup Table Engine for Photonic Accelerators
physics.optics · 2026-05-05 · unverdicted · none · ref 2 · 2 links

Blockchain and AI: Securing Intelligent Networks for the Future
cs.CR · 2026-04-07 · unverdicted · none · ref 131

A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models
math.OC · 2025-12-03 · unverdicted · none · ref 13

Auto-Relational Reasoning
cs.AI · 2026-04-29 · unverdicted · none · ref 6

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
cs.DC · 2025-03-11 · unverdicted · none · ref 129
