Missing-by-Design: Certifiable Modality Deletion for Revocable Multimodal Sentiment Analysis
Pith reviewed 2026-05-15 21:50 UTC · model grok-4.3
The pith
Missing-by-Design certifies deletion of specific modalities from multimodal sentiment models via targeted parameter updates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MBD establishes that property-aware representation learning, combined with generator-based reconstruction and a saliency-driven Gaussian parameter update, can produce a machine-verifiable Modality Deletion Certificate confirming removal of modality-specific information. It does so while delivering competitive predictive performance on incomplete inputs and a practical privacy-utility trade-off, as an efficient alternative to full retraining.
What carries the argument
The Modality Deletion Certificate generated by saliency-driven candidate selection followed by a calibrated Gaussian update on model parameters, which certifies removal of modality-specific information.
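To make the mechanism concrete, here is a minimal sketch of one way such a pipeline could look. It is not the paper's implementation. It assumes saliency is the absolute gradient of a modality-specific loss, that the update adds noise scaled by the classical Gaussian mechanism, and that the certificate is a plain record with a parameter hash. The names `saliency_topk`, `gaussian_sigma`, `delete_modality`, and `verify` are hypothetical.

```python
import hashlib
import numpy as np

def saliency_topk(grad_modality, k):
    """Indices of the k parameters with the largest |gradient| (assumed saliency score)."""
    return np.argsort(-np.abs(grad_modality))[:k]

def gaussian_sigma(sensitivity, epsilon, delta):
    """Classical Gaussian-mechanism noise scale; the standard bound holds for 0 < epsilon < 1."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def delete_modality(params, grad_modality, k, sensitivity, epsilon, delta, seed=0):
    """Perturb only the selected parameters and emit a hypothetical certificate record."""
    rng = np.random.default_rng(seed)
    idx = saliency_topk(grad_modality, k)
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    updated = params.copy()
    updated[idx] += rng.normal(0.0, sigma, size=k)  # noise on candidates only
    certificate = {
        "indices": sorted(int(i) for i in idx),
        "sigma": float(sigma),
        "epsilon": epsilon,
        "delta": delta,
        "params_sha256": hashlib.sha256(updated.tobytes()).hexdigest(),
    }
    return updated, certificate

def verify(certificate, params, sensitivity):
    """Recompute the noise scale and the parameter hash and compare with the record."""
    expected = gaussian_sigma(sensitivity, certificate["epsilon"], certificate["delta"])
    same_sigma = bool(np.isclose(certificate["sigma"], expected))
    same_hash = certificate["params_sha256"] == hashlib.sha256(params.tobytes()).hexdigest()
    return same_sigma and same_hash
```

A check of this kind only confirms that the recorded update was applied with the stated noise scale. Whether that update actually removes modality-specific information is the separate question raised under the load-bearing premise.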
If this is right
- Multimodal models maintain strong predictive performance when one or more modalities are missing by using the generator-based reconstruction.
- Deletion requests can be fulfilled through targeted parameter changes without requiring full model retraining.
- The resulting certificate supplies machine-verifiable proof that modality-specific signals have been removed.
- A practical privacy-utility balance is achieved on standard benchmark datasets for sentiment analysis.
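The first consequence above depends on reconstructing a missing modality from the ones that are present. As a rough illustration only, the sketch below stands in a ridge-regression linear map for the paper's generator, whose architecture is not described on this page. The function names and the `ridge` parameter are assumptions.

```python
import numpy as np

def fit_linear_generator(present, missing, ridge=1e-3):
    """Fit a ridge-regression map from present-modality embeddings to the missing one."""
    d = present.shape[1]
    gram = present.T @ present + ridge * np.eye(d)
    return np.linalg.solve(gram, present.T @ missing)

def reconstruct(present, weights):
    """Predict the missing-modality embedding from the present-modality embedding."""
    return present @ weights
```

At inference time, `reconstruct` would fill in the absent channel before fusion. A learned nonlinear generator would replace the linear map in practice.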
Where Pith is reading between the lines
- If the certificate holds under scrutiny, similar surgical updates could be adopted for other multimodal tasks where selective removal of input types is required.
- Regulatory frameworks might eventually mandate such certified deletion mechanisms for handling user requests in deployed multimodal systems.
- Robustness tests against reconstruction attacks on the updated parameters could be run to check for any undetected residual modality information.
Load-bearing premise
That saliency-driven candidate selection followed by a calibrated Gaussian update produces a machine-verifiable certificate that actually removes all modality-specific information without hidden leakage.
What would settle it
An experiment that trains a separate recovery model on the updated parameters and measures whether it can still predict information from the deleted modality above chance level on held-out test data.
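A recovery experiment of this kind could be sketched as follows. The sketch assumes the recovery model is a logistic-regression probe trained on post-deletion features, and that "above chance" means exceeding chance by a few standard errors on held-out data. The helper names and the threshold `z` are illustrative choices, not taken from the paper.

```python
import numpy as np

def train_probe(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent (y holds 0/1 labels)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def probe_accuracy(w, b, X, y):
    """Held-out accuracy of the probe."""
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))

def leaks(acc, n_test, chance=0.5, z=3.0):
    """Flag leakage when accuracy exceeds chance by more than z standard errors."""
    return bool(acc > chance + z * np.sqrt(chance * (1.0 - chance) / n_test))
```

Here `X` would be features extracted from the updated model and `y` a label derived from the deleted modality. A linear probe that fails to recover the label does not rule out nonlinear leakage, so stronger recovery models would be needed for a convincing test.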
Original abstract
As multimodal systems increasingly process sensitive personal data, the ability to selectively revoke specific data modalities has become a critical requirement for privacy compliance and user autonomy. We present Missing-by-Design (MBD), a unified framework for revocable multimodal sentiment analysis that combines structured representation learning with a certifiable parameter-modification pipeline. Revocability is critical in privacy-sensitive applications where users or regulators may request removal of modality-specific information. MBD learns property-aware embeddings and employs generator-based reconstruction to recover missing channels while preserving task-relevant signals. For deletion requests, the framework applies saliency-driven candidate selection and a calibrated Gaussian update to produce a machine-verifiable Modality Deletion Certificate. Experiments on benchmark datasets show that MBD achieves strong predictive performance under incomplete inputs and delivers a practical privacy-utility trade-off, positioning surgical unlearning as an efficient alternative to full retraining.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Missing-by-Design (MBD), a unified framework for revocable multimodal sentiment analysis. It combines property-aware embeddings and generator-based reconstruction to handle missing modalities while preserving task signals, and for deletion requests applies saliency-driven candidate selection followed by a calibrated Gaussian parameter update to generate a machine-verifiable Modality Deletion Certificate. Experiments on benchmark datasets are claimed to show strong predictive performance under incomplete inputs together with a practical privacy-utility trade-off, positioning the method as an efficient surgical-unlearning alternative to full retraining.
Significance. If the Modality Deletion Certificate can be shown to eliminate modality-specific information without residual leakage, the framework would supply a concrete, efficient mechanism for user-driven modality revocation in privacy-sensitive multimodal systems. This would be a meaningful contribution to certifiable unlearning, especially for applications such as sentiment analysis that process entangled personal data. The reported efficiency gains over retraining would be a clear practical advantage once the soundness claims are substantiated.
major comments (3)
- Abstract: The assertion that the calibrated Gaussian update produces a 'machine-verifiable Modality Deletion Certificate' is unsupported by any equations, proof sketch, or bound on residual mutual information; without such a derivation the certifiability claim cannot be evaluated.
- Method section (saliency-driven candidate selection and Gaussian update): No argument is supplied showing that the update eliminates cross-modal correlations typical in sentiment analysis; the procedure may leave predictive information in the retained embedding space that the certificate does not detect.
- Experiments section: No quantitative results on certificate soundness (e.g., post-deletion mutual-information estimates, modality-specific probe accuracy, or leakage metrics) are reported, leaving the central privacy guarantee unverified.
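One of the soundness metrics named in the last comment, a post-deletion mutual-information estimate, can be illustrated with a simple plug-in estimator. This is a sketch under strong assumptions: both variables are scalar, the estimate comes from a 2-D histogram, and the bin count is arbitrary. The function name is hypothetical, and the paper's evaluation protocol is not known from this page.

```python
import numpy as np

def mutual_information_bits(x, y, bins=8):
    """Plug-in mutual-information estimate (in bits) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

The plug-in estimator is biased upward on finite samples, so a small positive value after deletion is expected even for independent variables. Comparing against a shuffled baseline is a common way to calibrate it.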
minor comments (2)
- Abstract: The acronym 'MBD' is introduced without an explicit expansion on first use.
- Notation: The term 'property-aware embeddings' is used without a formal definition or reference to the precise loss terms that enforce the property.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which help us improve the clarity and rigor of our work on Missing-by-Design. We address each major comment below, proposing revisions to substantiate the certifiability claims.
Point-by-point responses
- Referee: Abstract: The assertion that the calibrated Gaussian update produces a 'machine-verifiable Modality Deletion Certificate' is unsupported by any equations, proof sketch, or bound on residual mutual information; without such a derivation the certifiability claim cannot be evaluated.
  Authors: We agree that the abstract's claim requires stronger theoretical support. In the revised manuscript, we will expand the Method section with a formal derivation of the Modality Deletion Certificate, including a proof sketch that the calibrated Gaussian update bounds the residual mutual information between the deleted modality and the model parameters to a negligible level, so that the certificate can be machine-verified by checking the update parameters.
  Revision: yes
- Referee: Method section (saliency-driven candidate selection and Gaussian update): No argument is supplied showing that the update eliminates cross-modal correlations typical in sentiment analysis; the procedure may leave predictive information in the retained embedding space that the certificate does not detect.
  Authors: The saliency-driven candidate selection identifies parameters with high influence on modality-specific predictions, and the subsequent Gaussian update is designed to perturb these parameters in a way that disrupts cross-modal correlations. While we believe the procedure achieves this based on the property-aware embeddings, we acknowledge the lack of an explicit argument. We will add a theoretical analysis in the revision demonstrating that the update reduces cross-modal mutual information, with the certificate serving as verification of this reduction.
  Revision: partial
- Referee: Experiments section: No quantitative results on certificate soundness (e.g., post-deletion mutual-information estimates, modality-specific probe accuracy, or leakage metrics) are reported, leaving the central privacy guarantee unverified.
  Authors: We concur that empirical evidence for the certificate's soundness is essential. The current experiments focus on predictive performance and efficiency, but we will include additional results in the revised version, such as post-deletion mutual-information estimates between modalities and probe-classifier accuracies for modality-specific information, to quantify the leakage and validate the privacy guarantees.
  Revision: yes
Circularity Check
No significant circularity in the MBD derivation chain
Full rationale
The paper describes a framework that learns property-aware embeddings, uses generator-based reconstruction for missing channels, and applies saliency-driven candidate selection plus a calibrated Gaussian update to generate a Modality Deletion Certificate. No equations or steps in the provided description reduce the certificate, or the claimed removal of modality-specific information, to a quantity defined by the same fitted parameters or by self-citation chains that bear the central load. The privacy-utility claims rest on experimental results on benchmark datasets rather than on self-referential definitions or imported uniqueness theorems, so the derivation can be checked against external evidence.
Axiom & Free-Parameter Ledger
free parameters (1)
- Gaussian calibration parameter
axioms (1)
- Domain assumption: saliency scores accurately identify parameters carrying modality-specific information
invented entities (1)
- Modality Deletion Certificate (no independent evidence)
Forward citations
Cited by 1 Pith paper
- EGAD: Entropy-Guided Adaptive Distillation for Token-Level Knowledge Transfer
  EGAD adaptively distills LLM knowledge at the token level by using entropy to create a curriculum from low- to high-entropy tokens, adjust temperature, and switch between logits-only and feature-based branches.
A Proofs and calibration details
This appendix collects the full derivation of the DP-like indistinguishability bound used in the paper, supp...