A cross-modal network for facial expression recognition
Pith reviewed 2026-05-08 18:36 UTC · model grok-4.3
The pith
CMNet recognizes facial expressions by combining symmetric features from whole and half faces.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CMNet learns expression information from three views of a face (the whole face, the left half, and the right half), exploiting facial symmetry to extract complementary features. A salient facial information refinement module is intended to counter the negative effects of fusing biological and structural information by retaining salient expression information, which the paper says improves the stability of the resulting classifier. A half-face alignment optimization mechanism aligns the expression information learned from the left and right half faces, to reduce reliance on one side of the face. The reported experiments show CMNet outperforming SCN and LAENet-SA on facial expression recognition.
What carries the argument
Cross-modal network (CMNet) with salient facial information refinement module and half-face alignment optimization mechanism that processes whole-face and half-face inputs symmetrically to extract complementary features.
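To make the whole-face and half-face inputs concrete, here is a minimal sketch of how the two half-face views might be produced from an aligned face image. The page does not say how the paper builds these inputs, so the midline split, the horizontal flip of the right half, and the name split_half_faces are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical image format: rows of grayscale pixel intensities.
Image = List[List[float]]


def split_half_faces(face: Image) -> Tuple[Image, Image]:
    """Split an aligned face image at its vertical midline.

    Returns (left_half, mirrored_right_half). The right half is flipped
    horizontally so both halves share the same orientation. For odd widths
    the centre column is dropped so the halves have equal width.
    ASSUMPTION: this preprocessing is illustrative, not the paper's method.
    """
    if not face or not face[0] or any(len(row) != len(face[0]) for row in face):
        raise ValueError("face must be a non-empty rectangular image")
    width = len(face[0])
    half = width // 2
    left = [row[:half] for row in face]
    right_mirrored = [row[width - half:][::-1] for row in face]
    return left, right_mirrored


if __name__ == "__main__":
    face = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    left, right = split_half_faces(face)
    print(left)   # [[1, 2], [5, 6]]
    print(right)  # [[4, 3], [8, 7]]
```

Mirroring the right half is one way to let a single branch design handle both halves; a model could equally keep the right half unflipped and learn separate weights.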
Load-bearing premise
That fusing biological and structural information from whole and half faces via the salient facial information refinement module and half-face alignment optimization mechanism does not produce negative effects and instead improves stability and performance of the obtained facial expression classifier.
What would settle it
A direct comparison on standard facial expression benchmarks showing that CMNet does not exceed the accuracy of SCN or LAENet-SA would falsify the outperformance claim.
Original abstract
Deep neural networks enriched with structural information have been widely employed for facial expression recognition tasks. However, these methods often depend on hierarchical information rather than face property to finish expression recognition. In this paper, we propose a cross-modal network with strong biological and structural information for facial expression recognition (CMNet). CMNet can respectively learn expression information via face symmetry on a whole face, left and right half faces to extract complementary facial features. To prevent negative effect of biological and structural information fusion, a salient facial information refinement module can obtain salient facial expression information to improve stability of an obtained facial expression classifier. To reduce reliance on unilateral facial features, a half-face alignment optimization mechanism is designed to align obtained expression information of learned left and right half faces. Our experimental results demonstrate that CMNet outperforms several novel methods, i.e., SCN and LAENet-SA for facial expression recognition. Codes can be obtained at https://github.com/hellloxiaotian/CMNet.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CMNet, a cross-modal network for facial expression recognition that processes whole-face, left-half-face, and right-half-face inputs to exploit symmetry for complementary expression features. It introduces a salient facial information refinement module to extract salient information and avoid negative fusion effects, plus a half-face alignment optimization mechanism to align half-face features and reduce unilateral reliance. The central empirical claim is that CMNet outperforms SCN and LAENet-SA.
Significance. If substantiated by controlled experiments, the incorporation of explicit biological priors (symmetry) and structural fusion mechanisms could offer a practical route to more stable FER models. The public code release aids reproducibility.
major comments (3)
- [Abstract] Abstract: the claim that CMNet 'outperforms several novel methods, i.e., SCN and LAENet-SA' is presented without any mention of datasets, training protocols, ablation studies, statistical tests, or error bars, so it is impossible to attribute gains to the proposed modules rather than uncontrolled factors.
- [Method] Method section (salient facial information refinement module): the assertion that this module 'can obtain salient facial expression information to improve stability' and 'prevent negative effect of biological and structural information fusion' is load-bearing for the central claim, yet no ablation (full CMNet vs. variant lacking the module) or capacity-matched baseline is reported.
- [Method] Method section (half-face alignment optimization mechanism): the claim that the mechanism 'align[s] obtained expression information of learned left and right half faces' and thereby reduces unilateral reliance lacks supporting controlled experiments that would demonstrate it mitigates negative fusion rather than simply adding parameters.
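The error bars and statistical tests requested in the major comments could be reported along the following lines. This is a minimal sketch, assuming each model is trained several times with matched random seeds; the function names and the accuracy values in the example are placeholders, not results from the paper.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_std(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-seed accuracies."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Paired t statistic for two models evaluated with the same seeds.

    Compare the result against a t distribution with len(a) - 1 degrees
    of freedom to obtain a p-value.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long sequences with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    spread = statistics.stdev(diffs)
    if spread == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))


if __name__ == "__main__":
    # Placeholder accuracies for illustration only (not from the paper).
    model_a = [0.90, 0.91, 0.89]
    model_b = [0.88, 0.90, 0.87]
    for name, runs in (("model_a", model_a), ("model_b", model_b)):
        m, s = mean_and_std(runs)
        print(f"{name}: {m:.4f} +/- {s:.4f}")
    print("paired t:", paired_t_statistic(model_a, model_b))
```

A paired test fits here because both models see the same seeds and data splits, so per-seed differences remove variance that is shared between runs.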
minor comments (1)
- [Abstract] Abstract: the phrasing 'CMNet can respectively learn expression information via face symmetry on a whole face, left and right half faces' is unclear and should be reworded for precision.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We agree that the abstract and experimental validation can be strengthened for clarity and rigor. We will revise the manuscript accordingly by expanding the abstract with experimental details and adding targeted ablation studies for the proposed modules.
- Referee: [Abstract] Abstract: the claim that CMNet 'outperforms several novel methods, i.e., SCN and LAENet-SA' is presented without any mention of datasets, training protocols, ablation studies, statistical tests, or error bars, so it is impossible to attribute gains to the proposed modules rather than uncontrolled factors.
  Authors: We agree that the abstract should provide more context to support the performance claim. In the revised version, we will update the abstract to explicitly mention the datasets (RAF-DB and FER2013), training protocols (including data augmentation and optimization details), reference to ablation studies in Section 4, and note that results include error bars with statistical significance testing. These details are already present in the experimental section and will now be summarized in the abstract to better attribute improvements to the cross-modal design and modules.
  Revision: yes
- Referee: [Method] Method section (salient facial information refinement module): the assertion that this module 'can obtain salient facial expression information to improve stability' and 'prevent negative effect of biological and structural information fusion' is load-bearing for the central claim, yet no ablation (full CMNet vs. variant lacking the module) or capacity-matched baseline is reported.
  Authors: We acknowledge that a direct ablation isolating the salient facial information refinement module would provide stronger evidence. The current manuscript demonstrates overall superiority over SCN and LAENet-SA, but to directly address this point we will add a new ablation study in the revised experiments section: comparing full CMNet against a variant without the refinement module, plus a capacity-matched baseline (e.g., by adjusting channel dimensions to equalize parameters). This will quantify the module's contribution to stability and negative fusion prevention.
  Revision: yes
- Referee: [Method] Method section (half-face alignment optimization mechanism): the claim that the mechanism 'align[s] obtained expression information of learned left and right half faces' and thereby reduces unilateral reliance lacks supporting controlled experiments that would demonstrate it mitigates negative fusion rather than simply adding parameters.
  Authors: We agree that controlled experiments are necessary to isolate the effect of the half-face alignment optimization mechanism. While the overall results support reduced unilateral reliance through the cross-modal design, we will add an ablation in the revision: full CMNet versus a variant without the alignment mechanism, including metrics on feature alignment (e.g., cosine similarity between left/right features) and performance under asymmetric conditions. This will demonstrate that the mechanism mitigates negative fusion beyond mere parameter addition.
  Revision: yes
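The feature-alignment metric mentioned in the last response (cosine similarity between left and right features) can be sketched as follows. This assumes each half-face branch yields one feature vector per image; the helper names cosine_similarity and mean_alignment are hypothetical and do not come from the paper or its code release.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_alignment(
    left_feats: Sequence[Sequence[float]],
    right_feats: Sequence[Sequence[float]],
) -> float:
    """Average cosine similarity over paired left/right half-face features."""
    if len(left_feats) != len(right_feats) or not left_feats:
        raise ValueError("need the same non-zero number of left and right features")
    sims = [cosine_similarity(l, r) for l, r in zip(left_feats, right_feats)]
    return sum(sims) / len(sims)
```

Reporting this score for the full model and for the variant without the alignment mechanism would show whether the mechanism actually pulls the two half-face representations together.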
Circularity Check
No circularity: empirical architecture validated by external comparisons
The paper proposes CMNet, a cross-modal network using whole-face and half-face symmetry to extract complementary features, with two custom modules (salient facial information refinement and half-face alignment optimization) to mitigate fusion issues. All load-bearing claims are supported by end-to-end experimental accuracy gains against SCN and LAENet-SA on standard benchmarks. No equations, fitted parameters renamed as predictions, self-citations forming uniqueness theorems, or ansatzes smuggled via prior work appear in the derivation chain. The architecture choices are presented as design decisions justified by biological intuition and then tested empirically, with no reduction of outputs to inputs by construction. This is a standard empirical DL proposal whose validity hinges on reproducible experiments rather than internal self-reference.
Axiom & Free-Parameter Ledger
free parameters (1)
- Module design choices and hyperparameters
axioms (2)
- domain assumption: Facial expressions are reliably encoded in symmetric and half-face structural information
- domain assumption: Fusing multi-view facial features improves classifier stability when properly refined
invented entities (2)
- Salient facial information refinement module: no independent evidence
- Half-face alignment optimization mechanism: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith.Cost (Jcost = ½(x + x⁻¹) − 1) · Cost.washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: L_sl = (2/NC) Σ (x_l − x_r)^2 ... α is set to 0.9 ... balance two losses
- Foundation.AlphaCoordinateFixation (RS chain has zero adjustable parameters) · alphaCoordinateFixationCert · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: α is set to 0.9 in this paper ... initial learning rate of 0.01 ... batch size of 32
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
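The loss excerpt quoted in the theorem list above, L_sl = (2/NC) Σ (x_l − x_r)^2 with α set to 0.9, can be read as a scaled squared difference between left and right half-face features. The sketch below is one possible reading under stated assumptions: N is taken to be the number of samples and C the feature dimension, and the two losses are combined as α·L_cls + (1 − α)·L_sl. The excerpt is elided and confirms neither assumption, and the function names are hypothetical.

```python
from typing import Sequence


def symmetry_loss(
    x_left: Sequence[Sequence[float]],
    x_right: Sequence[Sequence[float]],
) -> float:
    """L_sl = (2 / (N * C)) * sum((x_l - x_r)^2).

    ASSUMPTION: N is the number of samples and C the feature dimension.
    """
    n = len(x_left)
    if n == 0 or n != len(x_right):
        raise ValueError("need the same non-zero number of left and right samples")
    c = len(x_left[0])
    if c == 0:
        raise ValueError("feature vectors must be non-empty")
    total = 0.0
    for row_l, row_r in zip(x_left, x_right):
        if len(row_l) != c or len(row_r) != c:
            raise ValueError("all feature vectors must have length C")
        total += sum((l - r) ** 2 for l, r in zip(row_l, row_r))
    return 2.0 * total / (n * c)


def combined_loss(cls_loss: float, sl_loss: float, alpha: float = 0.9) -> float:
    """Balance a classification loss against the symmetry loss.

    ASSUMPTION: a convex combination weighted by alpha; the quoted excerpt
    only says that alpha balances the two losses.
    """
    return alpha * cls_loss + (1.0 - alpha) * sl_loss


if __name__ == "__main__":
    left = [[1.0, 2.0], [3.0, 4.0]]
    right = [[1.0, 0.0], [3.0, 2.0]]
    print(symmetry_loss(left, right))  # 4.0
```

With α = 0.9 under this reading, the classification term dominates and the symmetry term acts as a light regularizer on the two half-face branches.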