P2SGrad: Refined Gradients for Optimizing Deep Face Models

Hongsheng Li; Junjie Yan; Mengya Gao; Rui Zhao; Xiaogang Wang; Xiao Zhang; Yu Qiao

arxiv: 1905.02479 · v1 · pith:FLCV6VO6new · submitted 2019-05-07 · 💻 cs.CV

Xiao Zhang, Rui Zhao, Junjie Yan, Mengya Gao, Yu Qiao, Xiaogang Wang, Hongsheng Li

classification 💻 cs.CV

keywords p2sgrad · training · deep · face · gradients · losses · benchmarks · cosine

Cosine-based softmax losses significantly improve the performance of deep face recognition networks. However, these losses always include sensitive hyper-parameters that can make the training process unstable, and it is difficult to set suitable hyper-parameters for a specific dataset. This paper addresses this challenge by directly designing the gradients for adaptively training deep neural networks. We first investigate and unify previous cosine softmax losses by analyzing their gradients. This unified view inspires us to propose a novel gradient called P2SGrad (Probability-to-Similarity Gradient), which uses cosine similarity instead of classification probability to update the neural network parameters, so that training directly optimizes the metric used at test time. P2SGrad is adaptive and hyper-parameter free, which makes the training process more efficient and faster. We evaluate P2SGrad on three face recognition benchmarks: LFW, MegaFace, and IJB-C. The results show that P2SGrad is stable in training, robust to noise, and achieves state-of-the-art performance on all three benchmarks.
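
The abstract describes replacing the classification probability with a cosine similarity inside the gradient. The sketch below illustrates that contrast for a single sample. It is a hedged illustration only: the abstract does not state the formula, so the similarity-based term (cosine minus a one-hot target) is an assumed form, and the function names and the `scale` parameter are hypothetical.

```python
import math

def cosine_similarities(feature, class_weights):
    """Cosine similarity between one feature vector and each class weight vector."""
    f_norm = math.sqrt(sum(v * v for v in feature))
    sims = []
    for w in class_weights:
        w_norm = math.sqrt(sum(v * v for v in w))
        dot = sum(a * b for a, b in zip(feature, w))
        sims.append(dot / (f_norm * w_norm))
    return sims

def softmax_grad_wrt_cos(cosines, label, scale):
    """Gradient of scaled cosine-softmax cross-entropy w.r.t. each cosine.

    Equals scale * (p_j - 1[j == label]); it depends on the hyper-parameter `scale`.
    """
    logits = [scale * c for c in cosines]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [scale * (e / total - (1.0 if j == label else 0.0))
            for j, e in enumerate(exps)]

def similarity_grad_wrt_cos(cosines, label):
    """ASSUMED similarity-based term: cos_j - 1[j == label], with no scale or margin."""
    return [c - (1.0 if j == label else 0.0) for j, c in enumerate(cosines)]

# Toy example: a 2-D feature and three class weight vectors (made-up values).
cosines = cosine_similarities([1.0, 0.0], [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]])
grad_softmax_s1 = softmax_grad_wrt_cos(cosines, label=0, scale=1.0)
grad_softmax_s30 = softmax_grad_wrt_cos(cosines, label=0, scale=30.0)
grad_similarity = similarity_grad_wrt_cos(cosines, label=0)
```

In this toy case the softmax-based term changes with `scale`, while the assumed similarity-based term depends only on the cosines, which is one way to read the abstract's "hyper-parameter free" claim.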

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Towards multi-modal forgery representation learning for AI-generated video detection and localization
cs.CV 2026-05 unverdicted novelty 5.0

A multi-modal model with LMM semantic, ST visual, and PS audio branches enables simultaneous detection and fine-grained temporal localization of partial AI video forgeries, outperforming prior methods.