Gradient-based learning applied to document recognition

Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner · 1998

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

cs.AI · 2024-07-01 · accept · novelty 7.0

WE-MATH benchmark reveals most LMMs rely on rote memorization for visual math while GPT-4o has shifted toward knowledge generalization.

Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data

cs.LG · 2025-08-20 · unverdicted · novelty 5.0

EdgeFD uses a KMeans-based client-side filter to improve federated distillation accuracy close to IID levels on non-IID data distributions for resource-constrained edge devices.

Efficient compression of neural networks and datasets

cs.LG · 2025-05-23 · unverdicted · novelty 5.0

Refined probabilistic and smooth l0 pruning techniques approximate minimum description length for neural networks, achieving high compression with minimal accuracy loss and empirically verifying better sample efficiency and generalization on image and text tasks.

citing papers explorer

Showing 3 of 3 citing papers.

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
cs.AI · 2024-07-01 · accept · none · ref 2

Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data
cs.LG · 2025-08-20 · unverdicted · none · ref 24

Efficient compression of neural networks and datasets
cs.LG · 2025-05-23 · unverdicted · none · ref 36
