Improving Facial Emotion Recognition through Dataset Merging and Balanced Training Strategies

Serap Kırbız

arxiv: 2604.20307 · v1 · submitted 2026-04-22 · 💻 cs.CV

Pith reviewed 2026-05-09 23:58 UTC · model grok-4.3

keywords facial emotion recognition · dataset merging · class imbalance · data augmentation · weighted sampling · deep convolutional networks · CK+ · FER+

The pith

Merging the CK+, FER+, and KDEF datasets plus augmentation and weighted sampling lets a deep CNN classify seven basic facial emotions at 82% accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that combining three public facial emotion datasets creates a larger and more diverse training resource, while augmentation and random weighted sampling correct the class imbalance that remains after merging. A sympathetic reader would care because many emotion recognition models fail on underrepresented emotions when trained on small or skewed data, limiting their use in applications such as assistive technology or affective computing. The work shows that careful data preparation can raise overall accuracy to 82% without requiring a new network architecture. If the approach holds, it offers a straightforward route to stronger generalization on the standard seven-emotion task.

Core claim

By increasing training data through the merger of CK+, FER+, and KDEF and then applying online and offline augmentation together with random weighted sampling, the deep convolutional network reaches 82% accuracy on the seven basic emotions and demonstrates that these steps directly reduce the performance penalty caused by class imbalance.

What carries the argument

Dataset merging across CK+, FER+, and KDEF together with online/offline augmentation and random weighted sampling to correct remaining class imbalance before training a deep convolutional network.
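As a rough illustration of random weighted sampling, the sketch below gives each example a weight equal to the inverse of its class count, so that every class is drawn about equally often. The function names and the toy label counts are assumptions made for this example; the paper's exact weighting scheme is not stated here. In a PyTorch pipeline, the same per-example weights could be passed to `torch.utils.data.WeightedRandomSampler`, though the source does not say which framework was used.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per example: 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample_indices(labels, num_samples, seed=0):
    """Draw indices with replacement so each class is picked about equally often."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy imbalanced label list (hypothetical counts, not the paper's data).
labels = ["happy"] * 900 + ["disgust"] * 100
picked = weighted_sample_indices(labels, num_samples=10_000)
drawn = Counter(labels[i] for i in picked)
print(drawn)  # both classes should appear in roughly equal numbers
```

Because sampling is with replacement, minority-class images are repeated within an epoch, which is one reason such sampling is usually paired with augmentation.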

If this is right

The merged dataset supplies more examples per emotion class, supporting more stable feature learning inside the convolutional network.
Random weighted sampling and augmentation together shrink the accuracy gap between majority and minority emotion classes.
Overall classification reaches 82% on the seven basic emotions, outperforming training on any single source dataset.
The combination of merging and balancing directly mitigates the data imbalance problem that otherwise degrades facial emotion recognition.
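The gap between majority and minority classes mentioned above can be read off a confusion matrix. The following is a minimal sketch, assuming rows are true classes and columns are predictions; the three-class matrix is invented for illustration and is not taken from the paper.

```python
def per_class_recall(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j."""
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical matrix: two well-populated classes and one small class.
confusion = [
    [90, 5, 5],
    [10, 80, 10],
    [6, 4, 10],
]
recalls = per_class_recall(confusion)
gap = max(recalls) - min(recalls)
print(recalls, overall_accuracy(confusion), gap)
```

In this toy case the overall accuracy stays high while the small class is recognized only half the time, which is the kind of disparity a single accuracy figure can hide.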

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same merging-plus-balancing recipe could be tested on other multi-class image tasks that suffer from uneven label distributions.
Real-time systems in human-computer interaction might achieve more reliable emotion detection by adopting this data-preparation pipeline rather than solely changing model depth.
Cross-dataset label noise remains a hidden variable; explicit consistency checks on merged labels would be a natural next measurement.
If the 82% figure generalizes, incremental gains may now come more from refining the balancing weights than from further dataset growth.

Load-bearing premise

That emotion labels remain consistent and compatible when the three datasets are merged, and that augmentation plus weighted sampling improves generalization without introducing new biases or overfitting.
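One way to make the label-compatibility premise checkable is an explicit mapping from each dataset's raw labels to the seven target classes, with anything unmapped reported rather than silently merged. The sketch below is illustrative only: the dataset keys, the raw label spellings, and the `harmonize` helper are assumptions, and the real annotation files may use different names.

```python
# The seven target classes named in the source.
CANONICAL = ("angry", "disgust", "fear", "happy", "neutral", "sad", "surprise")

# Hypothetical raw label spellings per dataset; check each dataset's
# annotation files before relying on any of these keys.
LABEL_MAP = {
    "ckplus": {"anger": "angry", "disgust": "disgust", "fear": "fear",
               "happy": "happy", "neutral": "neutral", "sadness": "sad",
               "surprise": "surprise"},
    "ferplus": {"anger": "angry", "disgust": "disgust", "fear": "fear",
                "happiness": "happy", "neutral": "neutral", "sadness": "sad",
                "surprise": "surprise"},
    "kdef": {"AN": "angry", "DI": "disgust", "AF": "fear", "HA": "happy",
             "NE": "neutral", "SA": "sad", "SU": "surprise"},
}


def harmonize(records):
    """Map (source, raw_label, item_id) records onto the canonical classes.

    Returns (merged, unmapped) so that unmapped labels are reported, not lost.
    """
    merged, unmapped = [], []
    for source, raw_label, item_id in records:
        canonical = LABEL_MAP.get(source, {}).get(raw_label)
        if canonical is None:
            unmapped.append((source, raw_label, item_id))
        else:
            merged.append((canonical, source, item_id))
    return merged, unmapped


# Every mapping target must be one of the seven canonical classes.
assert all(v in CANONICAL for m in LABEL_MAP.values() for v in m.values())

records = [
    ("ckplus", "anger", "img_001"),
    ("ferplus", "happiness", "img_002"),
    ("kdef", "AF", "img_003"),
    ("ferplus", "contempt", "img_004"),  # hypothetical label outside the seven classes
]
merged, unmapped = harmonize(records)
print(merged)
print(unmapped)
```

Keeping the source name on every merged record also makes later per-dataset error analysis possible.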

What would settle it

Training the same network on each dataset separately without merging or balancing and finding accuracy well below 82%, or testing the merged model on a completely independent dataset such as AffectNet and observing a large drop in performance.

Figures

Figures reproduced from arXiv: 2604.20307 by Serap Kırbız.

**Figure 2.** The distribution of the data: (a) for each class and for each stage of the data

**Figure 3.** The images from three different datasets: FER+ (first row), CK+ (second row),

**Figure 4.** The seven basic emotions (happy, angry, neutral, fear, disgust, surprise, sad)

**Figure 5.** The confusion matrices obtained by the DenseNet121 model

Original abstract

In this paper, a deep learning framework is proposed for automatic facial emotion recognition based on deep convolutional networks. In order to increase the generalization ability and the robustness of the method, the dataset size is increased by merging three publicly available facial emotion datasets: CK+, FER+ and KDEF. Despite the increase in dataset size, the minority classes still suffer from an insufficient number of training samples, leading to data imbalance. The data imbalance problem is minimized by online and offline augmentation techniques and random weighted sampling. Experimental results demonstrate that the proposed method can recognize the seven basic emotions with 82% accuracy. The results demonstrate the effectiveness of the proposed approach in tackling the challenges of data imbalance and improving classification performance in facial emotion recognition.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Routine merging of CK+, FER+, and KDEF plus standard augmentation and weighted sampling reaches 82% on seven-class FER, but adds no new methods and leaves label alignment unaddressed.

The paper merges three public facial emotion datasets to increase training size, then applies offline and online augmentation plus random weighted sampling to reduce imbalance before training a deep CNN. It reports 82% accuracy on the seven basic emotions. That is the entire contribution in plain terms. The approach is practical and directly targets a known pain point in FER, where minority classes often have too few samples. The abstract lays out the steps clearly and shows the authors tried to handle the imbalance that remains after merging. Those are the parts that work as described.

The evaluation is thin. No baseline accuracies appear for the individual datasets or for prior methods on the same task, so there is no way to tell whether the merging and balancing steps produced a real gain. The larger problem is label consistency across the sources. CK+ and KDEF contain posed expressions with structured coding, while FER+ uses crowd-sourced in-the-wild labels. The abstract gives no mapping table, agreement statistics, or cross-dataset check, so any accuracy number mixes recognition performance with possible annotation noise. The stress-test note on this point holds.

The work is a straightforward engineering exercise rather than a methodological advance. It will interest people who maintain or combine FER datasets and need quick examples of balancing tricks. It will not interest readers looking for new algorithms or strong comparative evidence. I would not bring it to a reading group, would not cite it, and would not send it for peer review.

Referee Report

3 major / 1 minor

Summary. The paper proposes a deep convolutional network framework for recognizing seven basic facial emotions. To boost generalization and robustness, it merges the CK+, FER+, and KDEF datasets and mitigates resulting class imbalance via online/offline augmentation plus random weighted sampling. Experimental results are reported to reach 82% accuracy, which the authors interpret as evidence that the merging and balancing strategies successfully address data imbalance and improve classification performance.

Significance. If the 82% figure were shown to be robust against baselines, label harmonization checks, and standard validation protocols, the work would offer a straightforward empirical recipe for enlarging training sets in facial emotion recognition while controlling imbalance. The approach itself is conventional, so its value would lie mainly in the concrete performance lift rather than in novel methodology.

major comments (3)

[Abstract] Abstract and Experimental results section: the central claim of 82% accuracy is presented without any baseline comparisons, cross-validation protocol, or error analysis. Without these, it is impossible to determine whether the reported accuracy reflects genuine improvement from dataset merging and balancing or simply the result of training on a larger but noisier pool.
[Methods] Methods / Dataset merging description: no label-mapping table, inter-annotator agreement statistics, or cross-dataset consistency check is supplied for aligning the seven emotion classes across CK+ (FACS-coded posed expressions), FER+ (crowd-sourced in-the-wild labels), and KDEF (actor-intended posed expressions). Systematic label mismatches would directly undermine the accuracy figure.
[Experimental results] Experimental results: the claim that augmentation and weighted sampling improve generalization is not supported by ablation studies or held-out test details. Standard augmentation cannot correct label noise; therefore the 82% result may conflate recognition performance with annotation artifacts.
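The ablation the referee asks for amounts to training once per on/off combination of the three balancing components. A minimal sketch of enumerating those runs follows; the component names are labels chosen for this example, and the training step itself is left out.

```python
from itertools import product

# Names are illustrative labels for the three components discussed above.
COMPONENTS = ("offline_augmentation", "online_augmentation", "weighted_sampling")


def ablation_configs():
    """Enumerate every on/off combination of the three balancing components."""
    return [
        dict(zip(COMPONENTS, flags))
        for flags in product((False, True), repeat=len(COMPONENTS))
    ]


configs = ablation_configs()
for cfg in configs:
    enabled = [name for name, on in cfg.items() if on] or ["baseline (none)"]
    print(" + ".join(enabled))
```

The first configuration, with everything switched off, is the baseline against which each component's contribution would be measured.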

minor comments (1)

[Abstract] The abstract states the use of 'deep convolutional networks' but never specifies the exact architecture, input resolution, or training hyperparameters.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.

Referee: [Abstract] Abstract and Experimental results section: the central claim of 82% accuracy is presented without any baseline comparisons, cross-validation protocol, or error analysis. Without these, it is impossible to determine whether the reported accuracy reflects genuine improvement from dataset merging and balancing or simply the result of training on a larger but noisier pool.

Authors: We acknowledge the absence of explicit baseline comparisons and detailed validation protocols in the current version. The 82% accuracy was obtained on the merged dataset using the proposed balancing strategies. In the revision we will add baseline results from models trained on each individual dataset (CK+, FER+, KDEF) separately, specify the cross-validation protocol (5-fold stratified), and include a confusion matrix plus per-class precision/recall for error analysis. These additions will allow direct assessment of whether the performance lift stems from merging and balancing.
revision: yes

Referee: [Methods] Methods / Dataset merging description: no label-mapping table, inter-annotator agreement statistics, or cross-dataset consistency check is supplied for aligning the seven emotion classes across CK+ (FACS-coded posed expressions), FER+ (crowd-sourced in-the-wild labels), and KDEF (actor-intended posed expressions). Systematic label mismatches would directly undermine the accuracy figure.

Authors: We will insert a label-mapping table in the revised Methods section that explicitly shows the correspondence of the seven emotion categories across the three datasets. All datasets use the same seven-class taxonomy, and mapping followed the original annotations. Inter-annotator agreement statistics are unavailable for FER+ in its public release, so we cannot generate new figures; we will instead add a brief discussion of each dataset's labeling provenance and known variability. We will also describe the manual sample verification performed during merging to check label consistency.
revision: partial

Referee: [Experimental results] Experimental results: the claim that augmentation and weighted sampling improve generalization is not supported by ablation studies or held-out test details. Standard augmentation cannot correct label noise; therefore the 82% result may conflate recognition performance with annotation artifacts.

Authors: We will add ablation experiments that isolate the contribution of offline augmentation, online augmentation, and random weighted sampling. The test-set split (20% held-out, stratified by class and dataset source) will be detailed. While we agree that augmentation cannot remove label noise, the weighted sampling was introduced precisely to improve minority-class generalization on the merged data; we will discuss the potential influence of annotation artifacts and note that the multi-source training provides a form of robustness check.
revision: yes
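A 20% hold-out stratified by class and dataset source, as described in the response above, could look roughly like the sketch below. The record layout (`label` and `source` keys), the function name, and the toy data are assumptions for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict


def stratified_holdout(records, test_fraction=0.2, seed=0):
    """Split so each (label, source) group sends about test_fraction to the test set."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["label"], rec["source"])].append(rec)

    rng = random.Random(seed)
    train, test = [], []
    for key in sorted(groups):  # sorted for a reproducible group order
        members = groups[key]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy data: 50 records for each (label, source) pair; values are hypothetical.
records = [
    {"id": f"{label}-{source}-{i}", "label": label, "source": source}
    for label in ("happy", "disgust")
    for source in ("ckplus", "kdef")
    for i in range(50)
]
train, test = stratified_holdout(records)
print(len(train), len(test))
```

Splitting within each group keeps both the class proportions and the dataset mix of the test set close to those of the full pool.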

standing simulated objections not resolved

Inter-annotator agreement statistics for FER+, which are not provided in the original dataset release and cannot be retroactively computed without re-annotating the data.
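For context on what such a statistic would involve, Fleiss' kappa is one standard agreement measure for several raters per item. The sketch below assumes per-item vote counts are available, which the objection above says is not the case for FER+; the input layout and the example votes are hypothetical.

```python
def fleiss_kappa(vote_counts):
    """Fleiss' kappa for a table where vote_counts[i][j] is the number of
    raters who assigned item i to category j.

    Every item must have the same total number of raters (at least two).
    """
    n_items = len(vote_counts)
    n_raters = sum(vote_counts[0])
    n_categories = len(vote_counts[0])

    # Observed agreement: average over items of the pairwise agreement rate.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in vote_counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement from the overall share of votes per category.
    p_category = [
        sum(row[j] for row in vote_counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in p_category)

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical votes: 3 raters, 2 categories, 2 items.
print(fleiss_kappa([[3, 0], [0, 3]]))  # unanimous raters
print(fleiss_kappa([[2, 1], [1, 2]]))  # split raters
```

A value near 1 indicates strong agreement, while values near or below 0 indicate agreement no better than chance.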

Circularity Check

0 steps flagged

No circularity: purely empirical ML pipeline with held-out evaluation

The paper describes a standard supervised learning workflow: merge three public datasets (CK+, FER+, KDEF), apply online/offline augmentation and weighted sampling to address class imbalance, train a deep CNN, and report accuracy on held-out test data. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text or abstract. The 82% accuracy figure is obtained by direct training and evaluation rather than by algebraic reduction to the input data or prior author results. This is the expected non-circular outcome for an empirical computer-vision experiment.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract mentions no free parameters, axioms, or invented entities. The approach relies on standard supervised deep learning practices whose details are not specified.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

[1]

Dhuheir, A

M. Dhuheir, A. Albaseer, E. Baccour, A. Erbad, M. Abdallah, M. Hamdi, Emotion recognition for healthcare surveillance systems us- ing neural networks: A survey, in: 2021 International Wireless Commu- nications and Mobile Computing (IWCMC), IEEE, 2021, pp. 681–687

work page 2021
[2]

Ribeiro, G

B. Ribeiro, G. Oliveira, A. Laranjeira, J. P. Arrais, Deep learning in digital marketing: brand detection and emotion recognition, Interna- tional Journal of Machine Intelligence and Sensory Signal Processing 2 (2017) 32–50

work page 2017
[3]

Abdat, C

F. Abdat, C. Maaoui, A. Pruski, Human-computer interaction using emotion recognition from facial expression, in: 2011 UKSim 5th Euro- pean Symposium on Computer Modeling and Simulation, IEEE, 2011, pp. 196–201

work page 2011
[4]

Y. An, J. Lee, E. Bak, S. Pan, Deep facial emotion recognition using local features based on facial landmarks for security system., Computers, Materials & Continua 76 (2023). 19

work page 2023
[5]

Ekman, W

P. Ekman, W. V. Friesen, Constants across cultures in the face and emotion., Journal of personality and social psychology 17 (1971) 124

work page 1971
[6]

Matsumoto, More evidence for the universality of a contempt ex- pression, Motivation and Emotion 16 (1992) 363–368

D. Matsumoto, More evidence for the universality of a contempt ex- pression, Motivation and Emotion 16 (1992) 363–368

work page 1992
[7]

C. Shan, S. Gong, P. W. McOwan, Facial expression recognition based on local binary patterns: A comprehensive study, Image and vision Computing 27 (2009) 803–816

work page 2009
[8]

Y. Shi, Z. Lv, N. Bi, C. Zhang, An improved sift algorithm for robust emotion recognition under various face poses and illuminations, Neural Computing and Applications 32 (2020) 9267–9281

work page 2020
[9]

J. Zhou, S. Zhang, H. Mei, D. Wang, A method of facial expression recognition based on gabor and nmf, Pattern Recognition and Image Analysis 26 (2016) 119–124

work page 2016
[10]

Abdulrahman, A

M. Abdulrahman, A. Eleyan, Facial expression recognition using sup- port vector machines, in: 2015 23nd signal processing and communica- tions applications conference (SIU), IEEE, 2015, pp. 276–279

work page 2015
[11]

P. P. Thakare, P. S. Patil, Facial expression recognition algorithm based on knn classifier, International Journal of Computer Science and Net- work 5 (2016) 941

work page 2016
[12]

Kayao˘ glu, T

B. Kayao˘ glu, T. Tokta¸ s, S. Kırbız, Cnn-based emotion recognition using data augmentation and preprocessing methods, in: 2023 Innovations in Intelligent Systems and Applications Conference (ASYU), IEEE, 2023, pp. 1–4

work page 2023
[13]

Ezerceli, M

¨O. Ezerceli, M. T. Eskil, Convolutional neural network (cnn) algorithm based facial emotion recognition (fer) system for fer-2013 dataset, in: 2022 International Conference on Electrical, Computer, Communica- tions and Mechatronics Engineering (ICECCME), IEEE, 2022, pp. 1–6

work page 2013
[14]

M. R. A. Borgalli, S. Surve, Deep learning for facial emotion recogni- tion using custom cnn architecture, in: Journal of Physics: Conference Series, volume 2236, IOP Publishing, 2022, p. 012004. 20

work page 2022
[15]

S. Z. Jumani, F. Ali, S. Guriro, I. A. Kandhro, A. Khan, A. Zaidi, Facial expression recognition with histogram of oriented gradients using cnn, Indian Journal of Science and Technology 12 (2019) 1–8

work page 2019
[16]

Lucey, J

P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, I. Matthews, The extended cohn-kanade dataset (ck+): A complete dataset for ac- tion unit and emotion-specified expression, in: 2010 ieee computer so- ciety conference on computer vision and pattern recognition-workshops, IEEE, 2010, pp. 94–101

work page 2010
[17]

M. G. Calvo, D. Lundqvist, Facial expressions of emotion (kdef): Iden- tification under different display-duration conditions, Behavior research methods 40 (2008) 109–115

work page 2008
[18]

Barsoum, C

E. Barsoum, C. Zhang, C. C. Ferrer, Z. Zhang, Training deep networks for facial expression recognition with crowd-sourced label distribution, in: Proceedings of the 18th ACM international conference on multi- modal interaction, 2016, pp. 279–283

work page 2016
[19]

J. Deng, J. Guo, E. Ververas, I. Kotsia, S. Zafeiriou, Retinaface: Single-shot multi-level face localisation in the wild, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 5203–5212

work page 2020
[20]

E. D. Cubuk, B. Zoph, J. Shlens, Q. V. Le, Randaugment: Practical automated data augmentation with a reduced search space, in: Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, 2020, pp. 702–703

work page 2020
[21]

Shekelyan, G

M. Shekelyan, G. Cormode, P. Triantafillou, A. Shanghooshabad, Q. Ma, Weighted random sampling over joins, arXiv preprint arXiv:2201.02670 (2022)

work page arXiv 2022
[22]

Y. H. Kwon, N. da Vitoria Lobo, Age classification from facial images, Computer vision and image understanding 74 (1999) 1–21

work page 1999
[23]

S. Li, W. Deng, Deep facial expression recognition: A survey, IEEE transactions on affective computing 13 (2020) 1195–1215

work page 2020
[24]

Tolba, A

A. Tolba, A. El-Baz, A. El-Harby, Face recognition: A literature review, International Journal of Signal Processing 2 (2006) 88–103. 21

work page 2006
[25]

M. N. Chaudhari, M. Deshmukh, G. Ramrakhiani, R. Parvatikar, Face detection using viola jones algorithm and neural networks, in: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), IEEE, 2018, pp. 1–6

work page 2018
[26]

Merget, M

D. Merget, M. Rock, G. Rigoll, Robust facial landmark detection via a fully-convolutional local-global context network, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 781–790

work page 2018
[27]

T. F. Cootes, G. J. Edwards, C. J. Taylor, Active appearance mod- els, IEEE Transactions on pattern analysis and machine intelligence 23 (2001) 681–685

work page 2001
[28]

T. F. Cootes, C. J. Taylor, D. H. Cooper, J. Graham, Active shape models-their training and application, Computer vision and image un- derstanding 61 (1995) 38–59

work page 1995
[29]

M. Kass, A. Witkin, D. Terzopoulos, Snakes: Active contour models, International journal of computer vision 1 (1988) 321–331

work page 1988
[30]

D. Chen, S. Ren, Y. Wei, X. Cao, J. Sun, Joint cascade fac detection and alignment, in: Computer Vision–EECV 2014: 13th European Con- frence, Zurich, Switzerland, September 6-12, 2014, Proceedings, part VI 13, Springer, 2014, pp. 109–122

work page 2014
[31]

Zhang, Z

K. Zhang, Z. Zhang, Z. Li, Y. Qiao, Joint face detection and alignment using multitask cascaded convolutional networks, IEEE signal process- ing letters 23 (2016) 1499–1503

work page 2016
[32]

K. He, G. Gkioxari, P. Doll´ ar, R. Girshick, Mask r-cnn, in: Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961– 2969

work page 2017
[33]

Chaudhuri, N

B. Chaudhuri, N. Vesdapunt, B. Wang, Joint face detection and facial motion retargeting for multiple faces, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 9719–9728

work page 2019
[34]

E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, Q. V. Le, Autoaug- ment: Learning augmentation strategies from data, in: Proceedings of 22 the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 113–123

work page 2019
[35]

11 Published as a conference paper at ICLR 2025 Dan Hendrycks, Norman Mu, Ekin D Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshmi- narayanan

D. Hendrycks, N. Mu, E. D. Cubuk, B. Zoph, J. Gilmer, B. Laksh- minarayanan, Augmix: A simple data processing method to improve robustness and uncertainty, arXiv preprint arXiv:1912.02781 (2019)

work page arXiv 1912
[36]

Kırbız, Facial emotion recognition using residual neural networks, Electrica 24 (2024) 818–825

S. Kırbız, Facial emotion recognition using residual neural networks, Electrica 24 (2024) 818–825

work page 2024
[37]

Prakasa, Texture feature extraction by using local binary pattern, INKOM Journal 9 (2016) 45–48

E. Prakasa, Texture feature extraction by using local binary pattern, INKOM Journal 9 (2016) 45–48

work page 2016
[38]

H. Kaya, F. G¨ urpınar, A. A. Salah, Video-based emotion recognition in the wild using deep transfer learning and score fusion, Image and Vision Computing 65 (2017) 66–75

work page 2017
[39]

K. Wang, X. Peng, J. Yang, D. Meng, Y. Qiao, Region attention net- works for pose and occlusion robust facial expression recognition, IEEE Transactions on Image Processing 29 (2020) 4057–4069

work page 2020
[40]

Minaee, M

S. Minaee, M. Minaei, A. Abdolrashidi, Deep-emotion: Facial expression recognition using attentional convolutional network, Sensors 21 (2021) 3046

work page 2021
[41]

Zhang, Y

K. Zhang, Y. Huang, Y. Du, L. Wang, Facial expression recognition based on deep evolutional spatial-temporal networks, IEEE Transactions on Image Processing 26 (2017) 4193–4203

work page 2017
[42]

Z. Wang, K. Zhang, W. Luo, R. Sankaranarayana, Htnet for micro- expression recognition, Neurocomputing 602 (2024) 128196

work page 2024
[43]

W. Niu, K. Zhang, D. Li, W. Luo, Four-player groupgan for weak expression recognition via latent expression magnification, Knowledge- Based Systems 251 (2022) 109304

work page 2022
[44]

S. Yang, P. Luo, C.-C. Loy, X. Tang, Wider face: A face detection benchmark, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 5525–5533. 23

work page 2016
[45]

Garatti, A

S. Garatti, A. Car` e, M. C. Campi, Complexity is an effective observable to tune early stopping in scenario optimization, IEEE Transactions on Automatic Control 68 (2022) 928–942

work page 2022
[46]

Patwal, M

A. Patwal, M. Diwakar, A. Joshi, P. Singh, Facial expression recognition using densenet, in: 2022 OITS International Conference on Information Technology (OCIT), IEEE, 2022, pp. 548–552

work page 2022
[47]

Utami, R

P. Utami, R. Hartanto, I. Soesanti, The efficientnet performance for facial expressions recognition, in: 2022 5th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), IEEE, 2022, pp. 756–762

work page 2022
[48]

Zhong, J

Z. Zhong, J. Li, L. Ma, H. Jiang, H. Zhao, Deep residual networks for hy- perspectral image classification, in: 2017 IEEE international geoscience and remote sensing symposium (IGARSS), IEEE, 2017, pp. 1824–1827

work page 2017
[49]

T. A. Araf, A. Siddika, S. Karimi, M. G. R. Alam, Real-time face emotion recognition and visualization using grad-cam, in: 2022 Second International Conference on Advances in Electrical, Computing, Com- munication and Sustainable Technologies (ICAECT), IEEE, 2022, pp. 1–5

work page 2022
[50]

Lorch, J

S. Lorch, J. Gebele, P. Brune, Towards trustworthy ai: Evaluating shap and lime for facial emotion recognition (2025). 24

work page 2025

[1] [1]

Dhuheir, A

M. Dhuheir, A. Albaseer, E. Baccour, A. Erbad, M. Abdallah, M. Hamdi, Emotion recognition for healthcare surveillance systems us- ing neural networks: A survey, in: 2021 International Wireless Commu- nications and Mobile Computing (IWCMC), IEEE, 2021, pp. 681–687

work page 2021

[2] [2]

Ribeiro, G

B. Ribeiro, G. Oliveira, A. Laranjeira, J. P. Arrais, Deep learning in digital marketing: brand detection and emotion recognition, Interna- tional Journal of Machine Intelligence and Sensory Signal Processing 2 (2017) 32–50

work page 2017

[3] [3]

Abdat, C

F. Abdat, C. Maaoui, A. Pruski, Human-computer interaction using emotion recognition from facial expression, in: 2011 UKSim 5th Euro- pean Symposium on Computer Modeling and Simulation, IEEE, 2011, pp. 196–201

work page 2011

[4] [4]

Y. An, J. Lee, E. Bak, S. Pan, Deep facial emotion recognition using local features based on facial landmarks for security system., Computers, Materials & Continua 76 (2023). 19

work page 2023

[5] [5]

Ekman, W

P. Ekman, W. V. Friesen, Constants across cultures in the face and emotion., Journal of personality and social psychology 17 (1971) 124

work page 1971

[6] [6]

Matsumoto, More evidence for the universality of a contempt ex- pression, Motivation and Emotion 16 (1992) 363–368

D. Matsumoto, More evidence for the universality of a contempt ex- pression, Motivation and Emotion 16 (1992) 363–368

work page 1992

[7] [7]

C. Shan, S. Gong, P. W. McOwan, Facial expression recognition based on local binary patterns: A comprehensive study, Image and vision Computing 27 (2009) 803–816

work page 2009

[8] [8]

Y. Shi, Z. Lv, N. Bi, C. Zhang, An improved sift algorithm for robust emotion recognition under various face poses and illuminations, Neural Computing and Applications 32 (2020) 9267–9281

work page 2020

[9] [9]

J. Zhou, S. Zhang, H. Mei, D. Wang, A method of facial expression recognition based on gabor and nmf, Pattern Recognition and Image Analysis 26 (2016) 119–124

work page 2016

[10] [10]

Abdulrahman, A

M. Abdulrahman, A. Eleyan, Facial expression recognition using sup- port vector machines, in: 2015 23nd signal processing and communica- tions applications conference (SIU), IEEE, 2015, pp. 276–279

work page 2015

[11] [11]

P. P. Thakare, P. S. Patil, Facial expression recognition algorithm based on knn classifier, International Journal of Computer Science and Net- work 5 (2016) 941

work page 2016

[12] [12]

Kayao˘ glu, T

B. Kayao˘ glu, T. Tokta¸ s, S. Kırbız, Cnn-based emotion recognition using data augmentation and preprocessing methods, in: 2023 Innovations in Intelligent Systems and Applications Conference (ASYU), IEEE, 2023, pp. 1–4

work page 2023

[13] [13]

Ezerceli, M

¨O. Ezerceli, M. T. Eskil, Convolutional neural network (cnn) algorithm based facial emotion recognition (fer) system for fer-2013 dataset, in: 2022 International Conference on Electrical, Computer, Communica- tions and Mechatronics Engineering (ICECCME), IEEE, 2022, pp. 1–6

work page 2013

[14] [14]

M. R. A. Borgalli, S. Surve, Deep learning for facial emotion recogni- tion using custom cnn architecture, in: Journal of Physics: Conference Series, volume 2236, IOP Publishing, 2022, p. 012004. 20

work page 2022

[15] [15]

S. Z. Jumani, F. Ali, S. Guriro, I. A. Kandhro, A. Khan, A. Zaidi, Facial expression recognition with histogram of oriented gradients using cnn, Indian Journal of Science and Technology 12 (2019) 1–8

work page 2019

[16] [16]

Lucey, J

P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, I. Matthews, The extended cohn-kanade dataset (ck+): A complete dataset for ac- tion unit and emotion-specified expression, in: 2010 ieee computer so- ciety conference on computer vision and pattern recognition-workshops, IEEE, 2010, pp. 94–101

work page 2010

[17] [17]

M. G. Calvo, D. Lundqvist, Facial expressions of emotion (kdef): Iden- tification under different display-duration conditions, Behavior research methods 40 (2008) 109–115

work page 2008

[18] [18]

Barsoum, C

E. Barsoum, C. Zhang, C. C. Ferrer, Z. Zhang, Training deep networks for facial expression recognition with crowd-sourced label distribution, in: Proceedings of the 18th ACM international conference on multi- modal interaction, 2016, pp. 279–283

work page 2016

[19] [19]

J. Deng, J. Guo, E. Ververas, I. Kotsia, S. Zafeiriou, Retinaface: Single-shot multi-level face localisation in the wild, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 5203–5212

work page 2020

[20] [20]

E. D. Cubuk, B. Zoph, J. Shlens, Q. V. Le, Randaugment: Practical automated data augmentation with a reduced search space, in: Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, 2020, pp. 702–703

work page 2020

[21] [21]

Shekelyan, G

M. Shekelyan, G. Cormode, P. Triantafillou, A. Shanghooshabad, Q. Ma, Weighted random sampling over joins, arXiv preprint arXiv:2201.02670 (2022)

work page arXiv 2022

[22] [22]

Y. H. Kwon, N. da Vitoria Lobo, Age classification from facial images, Computer vision and image understanding 74 (1999) 1–21

work page 1999

[23] [23]

S. Li, W. Deng, Deep facial expression recognition: A survey, IEEE transactions on affective computing 13 (2020) 1195–1215

work page 2020

[24] [24]

Tolba, A

A. Tolba, A. El-Baz, A. El-Harby, Face recognition: A literature review, International Journal of Signal Processing 2 (2006) 88–103. 21

work page 2006

[25] [25]

M. N. Chaudhari, M. Deshmukh, G. Ramrakhiani, R. Parvatikar, Face detection using viola jones algorithm and neural networks, in: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), IEEE, 2018, pp. 1–6

work page 2018

[26] [26]

Merget, M

D. Merget, M. Rock, G. Rigoll, Robust facial landmark detection via a fully-convolutional local-global context network, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 781–790

work page 2018

[27] [27]

T. F. Cootes, G. J. Edwards, C. J. Taylor, Active appearance mod- els, IEEE Transactions on pattern analysis and machine intelligence 23 (2001) 681–685

work page 2001

[28] [28]

T. F. Cootes, C. J. Taylor, D. H. Cooper, J. Graham, Active shape models-their training and application, Computer vision and image un- derstanding 61 (1995) 38–59

work page 1995

[29] M. Kass, A. Witkin, D. Terzopoulos, Snakes: Active contour models, International journal of computer vision 1 (1988) 321–331

[30] D. Chen, S. Ren, Y. Wei, X. Cao, J. Sun, Joint cascade face detection and alignment, in: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VI 13, Springer, 2014, pp. 109–122

[31] K. Zhang, Z. Zhang, Z. Li, Y. Qiao, Joint face detection and alignment using multitask cascaded convolutional networks, IEEE signal processing letters 23 (2016) 1499–1503

[32] K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask r-cnn, in: Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961–2969

[33] B. Chaudhuri, N. Vesdapunt, B. Wang, Joint face detection and facial motion retargeting for multiple faces, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 9719–9728

[34] E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, Q. V. Le, Autoaugment: Learning augmentation strategies from data, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 113–123

[35] D. Hendrycks, N. Mu, E. D. Cubuk, B. Zoph, J. Gilmer, B. Lakshminarayanan, Augmix: A simple data processing method to improve robustness and uncertainty, arXiv preprint arXiv:1912.02781 (2019)

[36] S. Kırbız, Facial emotion recognition using residual neural networks, Electrica 24 (2024) 818–825

[37] E. Prakasa, Texture feature extraction by using local binary pattern, INKOM Journal 9 (2016) 45–48

[38] H. Kaya, F. Gürpınar, A. A. Salah, Video-based emotion recognition in the wild using deep transfer learning and score fusion, Image and Vision Computing 65 (2017) 66–75

[39] K. Wang, X. Peng, J. Yang, D. Meng, Y. Qiao, Region attention networks for pose and occlusion robust facial expression recognition, IEEE Transactions on Image Processing 29 (2020) 4057–4069

[40] S. Minaee, M. Minaei, A. Abdolrashidi, Deep-emotion: Facial expression recognition using attentional convolutional network, Sensors 21 (2021) 3046

[41] K. Zhang, Y. Huang, Y. Du, L. Wang, Facial expression recognition based on deep evolutional spatial-temporal networks, IEEE Transactions on Image Processing 26 (2017) 4193–4203

[42] Z. Wang, K. Zhang, W. Luo, R. Sankaranarayana, Htnet for micro-expression recognition, Neurocomputing 602 (2024) 128196

[43] W. Niu, K. Zhang, D. Li, W. Luo, Four-player groupgan for weak expression recognition via latent expression magnification, Knowledge-Based Systems 251 (2022) 109304

[44] S. Yang, P. Luo, C.-C. Loy, X. Tang, Wider face: A face detection benchmark, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 5525–5533

[45] S. Garatti, A. Carè, M. C. Campi, Complexity is an effective observable to tune early stopping in scenario optimization, IEEE Transactions on Automatic Control 68 (2022) 928–942

[46] A. Patwal, M. Diwakar, A. Joshi, P. Singh, Facial expression recognition using densenet, in: 2022 OITS International Conference on Information Technology (OCIT), IEEE, 2022, pp. 548–552

[47] P. Utami, R. Hartanto, I. Soesanti, The efficientnet performance for facial expressions recognition, in: 2022 5th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), IEEE, 2022, pp. 756–762

[48] Z. Zhong, J. Li, L. Ma, H. Jiang, H. Zhao, Deep residual networks for hyperspectral image classification, in: 2017 IEEE international geoscience and remote sensing symposium (IGARSS), IEEE, 2017, pp. 1824–1827

[49] T. A. Araf, A. Siddika, S. Karimi, M. G. R. Alam, Real-time face emotion recognition and visualization using grad-cam, in: 2022 Second International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), IEEE, 2022, pp. 1–5

[50] S. Lorch, J. Gebele, P. Brune, Towards trustworthy ai: Evaluating shap and lime for facial emotion recognition (2025)