Assessing the Efficacy of Deep Learning Approaches for Facial Expression Recognition in Individuals with Intellectual Disabilities
Pith reviewed 2026-05-24 04:33 UTC · model grok-4.3
The pith
Deep learning models for facial expression recognition require user-specific training to handle the distinct expressions of individuals with intellectual disabilities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Comparing convolutional neural networks trained on an ensemble of datasets without individuals with intellectual disabilities against networks trained on a dataset featuring such individuals reveals significant differences in facial expressions. The paper takes these differences as evidence that tailored, user-specific training is needed for models to handle each user's unique expressions.
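To make "user-specific training" concrete, the sketch below contrasts two evaluation protocols: one where the test user is never seen during training, and one where training and test images come from the same user. This is a minimal illustration, not the paper's procedure. The `user` field and both function names are hypothetical, and the per-user split simply keeps the stored sample order rather than shuffling.

```python
from collections import defaultdict

def leave_one_user_out(samples):
    """Yield (held_out_user, train, test); the test user is absent from train."""
    users = sorted({s["user"] for s in samples})
    for user in users:
        train = [s for s in samples if s["user"] != user]
        test = [s for s in samples if s["user"] == user]
        yield user, train, test

def per_user_split(samples, train_fraction=0.5):
    """Yield (user, train, test) where both sets come from the same user."""
    by_user = defaultdict(list)
    for s in samples:
        by_user[s["user"]].append(s)
    for user in sorted(by_user):
        items = by_user[user]
        # Keep at least one training sample per user.
        cut = max(1, int(len(items) * train_fraction))
        yield user, items[:cut], items[cut:]

# Toy data: the "user" and "label" fields are illustrative assumptions.
samples = [
    {"user": "u1", "label": "happy"},
    {"user": "u1", "label": "sad"},
    {"user": "u1", "label": "happy"},
    {"user": "u1", "label": "neutral"},
    {"user": "u2", "label": "happy"},
    {"user": "u2", "label": "sad"},
]

for user, train, test in leave_one_user_out(samples):
    print("generic", user, len(train), len(test))
for user, train, test in per_user_split(samples):
    print("user-specific", user, len(train), len(test))
```

A large gap between the two protocols for the same user would be consistent with the claim that expressions differ between individuals, although it would not by itself identify the cause.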
What carries the argument
A set of 12 convolutional neural networks trained under different approaches, including on mixed datasets, and evaluated through both performance and attention-map variations to highlight differences in expression.
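One way to quantify "attention map variations" is to score how much two importance maps overlap. The sketch below is illustrative only and is not taken from the paper: it assumes each map is a 2D grid of non-negative relevance scores with the same shape, and the helper names `cosine_similarity` and `top_k_iou` are hypothetical.

```python
import math

def flatten(grid):
    """Flatten a 2D list of scores into a 1D list, row by row."""
    return [v for row in grid for v in row]

def cosine_similarity(map_a, map_b):
    """Cosine similarity between two equally shaped importance maps."""
    a, b = flatten(map_a), flatten(map_b)
    if len(a) != len(b):
        raise ValueError("maps must have the same shape")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_iou(map_a, map_b, k):
    """Intersection-over-union of the k highest-scoring cells of each map."""
    a, b = flatten(map_a), flatten(map_b)
    if len(a) != len(b):
        raise ValueError("maps must have the same shape")
    top_a = set(sorted(range(len(a)), key=lambda i: a[i], reverse=True)[:k])
    top_b = set(sorted(range(len(b)), key=lambda i: b[i], reverse=True)[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0

# Toy 3x3 maps (made-up values): one concentrated on the top row,
# one concentrated on the bottom row.
upper_face = [[0.9, 0.1, 0.9], [0.1, 0.0, 0.1], [0.0, 0.2, 0.0]]
lower_face = [[0.1, 0.0, 0.1], [0.1, 0.0, 0.1], [0.3, 0.9, 0.3]]

print(cosine_similarity(upper_face, lower_face))
print(top_k_iou(upper_face, lower_face, k=2))
```

Low overlap between maps for the same expression class would indicate that two models rely on different image regions, which is the kind of variation the review refers to.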
Load-bearing premise
The assumption that observed performance differences and attention map variations are caused by intellectual disability status rather than confounding factors such as dataset composition, age, lighting, or labeling differences between the compared datasets.
What would settle it
Retraining the same 12 networks on new datasets matched for age, lighting, and labeling but differing only by intellectual disability status, then checking if performance gaps and attention map differences disappear.
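A matched comparison of this kind could start from a subsampling step like the one sketched below, which keeps only covariate strata present in both groups and draws the same number of images from each. It is a minimal sketch under stated assumptions: the record fields (`group`, `age_band`, `lighting`, `label`) and the function name `matched_subsample` are hypothetical, and exact matching on coarse categories is only one of several possible matching schemes.

```python
import random
from collections import defaultdict

def matched_subsample(records, group_key, covariates, seed=0):
    """Return equally sized groups drawn from identical covariate strata.

    records: list of dicts; group_key: field holding the group label
    (exactly two groups expected); covariates: fields that must match.
    """
    rng = random.Random(seed)
    strata = defaultdict(lambda: defaultdict(list))
    for rec in records:
        stratum = tuple(rec[c] for c in covariates)
        strata[stratum][rec[group_key]].append(rec)

    groups = sorted({rec[group_key] for rec in records})
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")

    matched = {g: [] for g in groups}
    for stratum in sorted(strata):
        by_group = strata[stratum]
        # A stratum missing from one group contributes nothing.
        n = min(len(by_group[g]) for g in groups)
        for g in groups:
            matched[g].extend(rng.sample(by_group[g], n))
    return matched

# Toy records: all field names and values are illustrative assumptions.
records = [
    {"id": 1, "group": "ID", "age_band": "adult", "lighting": "indoor", "label": "happy"},
    {"id": 2, "group": "ID", "age_band": "adult", "lighting": "indoor", "label": "happy"},
    {"id": 3, "group": "non-ID", "age_band": "adult", "lighting": "indoor", "label": "happy"},
    {"id": 4, "group": "non-ID", "age_band": "child", "lighting": "studio", "label": "sad"},
    {"id": 5, "group": "ID", "age_band": "adult", "lighting": "indoor", "label": "sad"},
    {"id": 6, "group": "non-ID", "age_band": "adult", "lighting": "indoor", "label": "sad"},
]

matched = matched_subsample(records, "group", ["age_band", "lighting", "label"])
print({g: sorted(r["id"] for r in rows) for g, rows in matched.items()})
```

Training and evaluating on the matched subsets, then checking whether the gaps persist, is the test described above. Exact matching discards data quickly as covariates are added, so a real study might need coarser bins or a different adjustment method.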
Original abstract
Facial expression recognition has gained significance as a means of imparting social robots with the capacity to discern the emotional states of users. The use of social robotics includes a variety of settings, including homes, nursing homes or daycare centers, serving to a wide range of users. Remarkable performance has been achieved by deep learning approaches, however, its direct use for recognizing facial expressions in individuals with intellectual disabilities has not been yet studied in the literature, to the best of our knowledge. To address this objective, we train a set of 12 convolutional neural networks in different approaches, including an ensemble of datasets without individuals with intellectual disabilities and a dataset featuring such individuals. Our examination of the outcomes, both the performance and the important image regions for the models, reveals significant distinctions in facial expressions between individuals with and without intellectual disabilities, as well as among individuals with intellectual disabilities. Remarkably, our findings show the need of facial expression recognition within this population through tailored user-specific training methodologies, which enable the models to effectively address the unique expressions of each user.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper trains 12 CNNs for facial expression recognition, comparing an ensemble of datasets without individuals with intellectual disabilities (non-ID) against a dataset of individuals with intellectual disabilities (ID). It reports performance differences and variations in attention maps, concluding that these indicate unique expressions in individuals with intellectual disabilities and that user-specific training methodologies are required to address them effectively.
Significance. If the performance and attention-map differences can be causally attributed to intellectual-disability status rather than dataset confounders, the work would usefully highlight limitations of off-the-shelf FER models for inclusive robotics applications. The study fills a literature gap on this population, but the current experimental design does not isolate the claimed causal factor.
major comments (2)
- [Methods / Experimental setup] The central claim that distinctions require user-specific training rests on attributing performance gaps and attention-map differences to ID status. The experimental comparison (ensemble of non-ID datasets vs. ID dataset) provides no matching, stratification, or covariate adjustment for age, lighting, camera angle, labeling protocol, or expression distribution; without such controls the attribution cannot be isolated and the recommendation for tailored methodologies is not supported.
- [Results] Results section: no quantitative metrics (accuracy, F1, confusion matrices), dataset sizes, statistical tests, or model-configuration details are supplied to allow verification that the reported distinctions are reliable or larger than would be expected from the listed confounders.
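The quantitative reporting requested in the comments above could look like the following sketch, which computes a confusion matrix, accuracy, and macro-averaged F1, and compares two models on the same test images with an exact McNemar test. It is an illustration under assumptions rather than the authors' evaluation code: the labels and predictions are made up, and the choice of McNemar's test is one reasonable option for paired classifier comparison, not something the paper specifies.

```python
from math import comb

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value for two models on the same samples."""
    only_a = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    only_b = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree on correctness
    tail = sum(comb(n, k) for k in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up labels and predictions for two hypothetical models.
labels = ["happy", "sad", "neutral"]
y_true = ["happy", "happy", "sad", "sad", "neutral", "neutral"]
generic = ["happy", "sad", "sad", "neutral", "neutral", "happy"]
tailored = ["happy", "happy", "sad", "sad", "neutral", "happy"]

cm = confusion_matrix(y_true, generic, labels)
print(cm, accuracy(cm), macro_f1(cm))
print(mcnemar_exact(y_true, generic, tailored))
```

With only six toy samples the test cannot reach significance, which illustrates why the referee also asks for dataset sizes alongside the metrics.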
minor comments (2)
- [Abstract] The abstract and introduction should explicitly state the sizes and sources of the datasets in the non-ID ensemble and of the ID corpus so readers can assess comparability.
- [Methods] Clarify how the 12 models were selected and whether hyper-parameters were tuned separately on each corpus; this affects interpretation of the performance comparison.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments on our manuscript. We address each of the major comments below and outline the revisions we plan to make.
- Referee: [Methods / Experimental setup] The central claim that distinctions require user-specific training rests on attributing performance gaps and attention-map differences to ID status. The experimental comparison (ensemble of non-ID datasets vs. ID dataset) provides no matching, stratification, or covariate adjustment for age, lighting, camera angle, labeling protocol, or expression distribution; without such controls the attribution cannot be isolated and the recommendation for tailored methodologies is not supported.
  Authors: We agree that the experimental design does not include explicit controls or adjustments for the potential confounders mentioned. The datasets used are existing collections that reflect practical scenarios in which such models would be deployed. While this limits causal attribution to ID status alone, the consistent differences observed across multiple models and attention maps support our conclusion that off-the-shelf models may not suffice. In the revised manuscript, we will expand the discussion to acknowledge these limitations and emphasize that the recommendation for user-specific training is based on observed performance gaps rather than strict causal isolation.
  Revision: yes
- Referee: [Results] Results section: no quantitative metrics (accuracy, F1, confusion matrices), dataset sizes, statistical tests, or model-configuration details are supplied to allow verification that the reported distinctions are reliable or larger than would be expected from the listed confounders.
  Authors: We acknowledge that the current version of the manuscript lacks sufficient quantitative metrics, dataset sizes, statistical tests, and detailed model configurations in the Results section. We will revise the manuscript to include accuracy, F1 scores, confusion matrices, dataset sizes, statistical tests, and full model details to allow proper verification.
  Revision: yes
Circularity Check
No circularity: purely empirical ML evaluation with independent dataset comparisons
full rationale
The paper trains 12 CNNs on an ensemble of non-ID datasets versus an ID dataset, then reports performance metrics and attention maps. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the load-bearing claims. The central recommendation for user-specific training follows directly from the observed experimental differences rather than reducing to any input by construction. This is a standard empirical study whose results are externally falsifiable via replication on the cited datasets.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Convolutional neural networks can extract discriminative facial features from images for expression classification