CosFace: Large Margin Cosine Loss for Deep Face Recognition

Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, ZhiFeng Li, Wei Liu


classification cs.CV

keywords loss, face, margin, cosine, recognition, softmax, variance, deep


Face recognition has made extraordinary progress owing to the advancement of deep convolutional neural networks (CNNs). The central task of face recognition, including face verification and identification, involves face feature discrimination. However, the traditional softmax loss of deep CNNs usually lacks the power of discrimination. To address this problem, several loss functions such as center loss, large margin softmax loss, and angular softmax loss have recently been proposed. All these improved losses share the same idea: maximizing inter-class variance and minimizing intra-class variance. In this paper, we propose a novel loss function, namely large margin cosine loss (LMCL), to realize this idea from a different perspective. More specifically, we reformulate the softmax loss as a cosine loss by $L_2$-normalizing both features and weight vectors to remove radial variations, and on this basis introduce a cosine margin term to further maximize the decision margin in the angular space. As a result, minimum intra-class variance and maximum inter-class variance are achieved by virtue of normalization and cosine decision margin maximization. We refer to our model trained with LMCL as CosFace. Extensive experimental evaluations are conducted on the most popular public-domain face recognition datasets, such as MegaFace Challenge, YouTube Faces (YTF), and Labeled Faces in the Wild (LFW). We achieve state-of-the-art performance on these benchmarks, which confirms the effectiveness of our proposed approach.
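The reformulation described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the commonly cited form of the loss, in which a margin `m` is subtracted from the target-class cosine and all cosines are multiplied by a scale `s` before the softmax. The scale `s`, the function name `lmcl_loss`, and the default values of `s` and `m` are illustrative assumptions and are not stated in the abstract.

```python
import numpy as np

def lmcl_loss(features, weights, labels, s=30.0, m=0.35):
    """Sketch of a large margin cosine loss (hypothetical helper).

    features: (N, D) array of embeddings
    weights:  (C, D) array, one weight vector per class
    labels:   (N,) integer array of class indices
    s, m:     scale and cosine margin (illustrative values, assumed)
    """
    # L2-normalize features and class weights to remove radial variation.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)

    # Cosine similarity between every sample and every class: shape (N, C).
    cos = f @ w.T

    # Scale the cosines, then subtract the scaled margin from the target class only.
    logits = s * cos
    rows = np.arange(len(labels))
    logits[rows, labels] -= s * m

    # Numerically stable log-softmax followed by the mean negative log-likelihood.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[rows, labels].mean()
```

With `m=0` this reduces to a softmax loss on normalized features and weights. A positive margin lowers the target-class logit, so a sample must exceed the other classes' cosines by at least `m` to reach the same loss value.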



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Find the Differences: Differential Morphing Attack Detection vs Face Recognition
cs.CV 2026-04 unverdicted novelty 5.0

Face recognition systems can be repurposed for morphing attack detection using a new threshold that bounds vulnerability even to unknown morph types.

Using predefined vector systems to speed up neural network multimillion class classification
cs.LG 2026-04 unverdicted novelty 5.0

Predefined vector systems structure neural network latent spaces to allow O(1) label prediction via index searches on embedding vectors, delivering up to 11.6x speedup on multimillion-class tasks while preserving accu...