Generalization analysis with deep ReLU networks for metric and similarity learning

Ding-Xuan Zhou; Junyu Zhou; Puyu Wang

arxiv: 2405.06415 · v2 · pith:2WWXP3NYnew · submitted 2024-05-10 · 📊 stat.ML · cs.LG

Junyu Zhou, Puyu Wang, Ding-Xuan Zhou

classification 📊 stat.ML cs.LG

keywords: metric, similarity, learning, generalization, network, true, approximation, deep

While metric and similarity learning has been extensively studied from several theoretical perspectives, a rigorous understanding of its generalization performance is still lacking. In this paper, we investigate the generalization behavior of metric and similarity learning by exploiting the specific structure of the true metric (i.e., the target function). In particular, by deriving the explicit form of the true metric for metric and similarity learning with the hinge loss, we construct a structured deep ReLU neural network as an approximation of the true metric, whose approximation ability depends on the network complexity. Here, the network complexity is characterized by the network depth, the number of nonzero weights, and the number of computational units. Based on the hypothesis space consisting of such structured deep ReLU networks, we establish excess risk bounds for metric and similarity learning by carefully controlling both the approximation error and the estimation error. An explicit excess risk rate is derived by choosing the proper capacity of the constructed hypothesis space. To the best of our knowledge, this is the first generalization analysis that provides explicit excess risk bounds for metric and similarity learning. In addition, we investigate properties of the true metric for metric and similarity learning under more general loss functions. Experiments show that the proposed model is empirically competitive and better captures the underlying similarity structure.
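The quantities named in the abstract can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the paper's construction: it evaluates a plain fully connected ReLU network on a concatenated pair (x, x'), reports the three complexity measures (depth, number of nonzero weights, number of computational units), and computes an empirical pairwise hinge risk. The helper names (`forward`, `complexity`, `pairwise_hinge_risk`, `toy_layers`), the choice to count biases among the nonzero weights, and the sign convention (larger output means more similar, with tau = +1 for same-label pairs and -1 otherwise) are all assumptions made for this example.

```python
# Illustrative sketch only; names and conventions are assumptions,
# not the structured network constructed in the paper.

def relu(v):
    return [max(0.0, t) for t in v]

def affine(W, b, v):
    # One affine map: W v + b, with W given as a list of rows.
    return [sum(w * t for w, t in zip(row, v)) + bias for row, bias in zip(W, b)]

def forward(layers, x, x_prime):
    # Evaluate f(x, x') on the concatenated pair; ReLU on hidden layers only.
    h = list(x) + list(x_prime)
    for W, b in layers[:-1]:
        h = relu(affine(W, b, h))
    W, b = layers[-1]
    return affine(W, b, h)[0]

def complexity(layers):
    # depth: number of hidden layers
    # nonzero: nonzero entries over all weight matrices and bias vectors (assumed convention)
    # units: number of hidden (computational) units
    depth = len(layers) - 1
    nonzero = sum(1 for W, b in layers for row in W for w in row if w != 0.0)
    nonzero += sum(1 for W, b in layers for t in b if t != 0.0)
    units = sum(len(b) for W, b in layers[:-1])
    return depth, nonzero, units

def pairwise_hinge_risk(layers, sample):
    # Average hinge loss over all ordered pairs i != j.
    # tau = +1 for same-label pairs, -1 otherwise (assumed convention).
    n = len(sample)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            (xi, yi), (xj, yj) = sample[i], sample[j]
            tau = 1.0 if yi == yj else -1.0
            total += max(0.0, 1.0 - tau * forward(layers, xi, xj))
    return total / (n * (n - 1))

# Toy network on 2-dimensional inputs: f(x, x') = 1 - |x[0] - x'[0]|,
# built from two hidden ReLU units.
toy_layers = [
    ([[1.0, 0.0, -1.0, 0.0], [-1.0, 0.0, 1.0, 0.0]], [0.0, 0.0]),
    ([[-1.0, -1.0]], [1.0]),
]

if __name__ == "__main__":
    sample = [((0.0, 0.0), 0), ((0.0, 5.0), 0), ((2.0, 0.0), 1)]
    print("depth, nonzero weights, units:", complexity(toy_layers))
    print("empirical pairwise hinge risk:", pairwise_hinge_risk(toy_layers, sample))
```

In this toy setup the risk is an average over pairs of examples rather than over single examples, which is the feature that distinguishes metric and similarity learning from standard pointwise classification.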

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Population Risk Bounds for Kolmogorov-Arnold Networks Trained by DP-SGD with Correlated Noise
cs.LG · 2026-05 · unverdicted · novelty 8.0

First population risk bounds for KANs under mini-batch DP-SGD with correlated noise, using a new non-convex optimization analysis combined with stability-based generalization.

Statistical learnability of smooth boundaries via pairwise binary classification with deep ReLU networks
math.ST · 2025-01 · unverdicted · novelty 6.0

Proves learnability of ordered multiple smooth boundaries in pairwise binary classification via localized deep ReLU networks.