arxiv: 2604.09288 · v1 · submitted 2026-04-10 · 💻 cs.LG

Are Independently Estimated View Uncertainties Comparable? Unified Routing for Trusted Multi-View Classification

Pith reviewed 2026-05-10 17:30 UTC · model grok-4.3

classification 💻 cs.LG
keywords: multi-view classification, evidential fusion, uncertainty estimation, unified routing, mixture of experts, trusted learning, sample-level reliability

The pith

Independently trained view branches produce uncertainties that cannot be directly compared for fusion because each optimizes only for its own accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that trusted multi-view classification rests on a fragile assumption: that evidence and uncertainty numbers from separate views can be added or averaged as if they sit on the same scale. Each view branch learns in isolation to maximize its own prediction accuracy, so nothing stops the strength of its evidence from drifting relative to the others. As a result, the uncertainties fed into fusion often reflect arbitrary per-branch scaling rather than how reliable each view actually is for a given sample. The authors replace local arbitration with a single router that sees the full multi-view input and learns to weight experts accordingly, backed by load-balancing and diversity terms.
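The scale problem can be illustrated with a minimal sketch. It assumes the usual evidential-deep-learning mapping from class evidence to a subjective-logic opinion (Dirichlet parameters equal to evidence plus one, uncertainty equal to the number of classes divided by the Dirichlet strength); the page does not spell out the paper's exact parameterization, and the evidence values and the helper names `opinion` and `predicted_class` are hypothetical.

```python
def opinion(evidence):
    """Map non-negative class evidence to (beliefs, uncertainty).

    Assumed mapping (standard in evidential deep learning):
    alpha_k = e_k + 1, S = sum(alpha), b_k = e_k / S, u = K / S.
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes  # Dirichlet strength S
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty


def predicted_class(evidence):
    """Index of the class with the most evidence."""
    return max(range(len(evidence)), key=lambda i: evidence[i])


# Hypothetical evidence from one view for a 3-class problem.
base = [4.0, 1.0, 1.0]
# The same view with its evidence scale multiplied by 10.
scaled = [10.0 * e for e in base]

_, u_base = opinion(base)
_, u_scaled = opinion(scaled)

print(predicted_class(base), predicted_class(scaled))  # same class for both
print(round(u_base, 3), round(u_scaled, 3))  # uncertainty drops with scale
```

Both versions of the view predict the same class, so a loss that only rewards correct predictions has little reason to prefer one scale over the other, yet the reported uncertainty differs by a large factor.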

Core claim

Trusted multi-view classification assumes evidence from independently trained view branches is numerically comparable for evidential fusion. In practice each branch optimizes only for its own prediction correctness, leaving no mechanism to align evidence strength across views; the resulting uncertainties are therefore dominated by branch-specific scale bias instead of sample-level reliability. TMUR decouples view-private evidence extraction from fusion by introducing private experts, one collaborative expert, and a unified router that observes global multi-view context to produce sample-level weights; soft load-balancing and diversity regularization further stabilize expert use. Theoretical

What carries the argument

A unified router that observes the global multi-view context and generates sample-level expert weights, working alongside view-private experts and one collaborative expert.
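A rough sketch of this arrangement follows. It assumes a linear router over the concatenated view features, softmax-normalized weights, and a weighted sum of expert evidence as the fusion step; none of these details are confirmed by the page, and every name and number below is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def route(view_features, router_weights):
    """Return one weight per expert, computed from all views at once."""
    # Global context: concatenate the features of every view.
    context = [x for view in view_features for x in view]
    # Assumed linear scoring: one row of router weights per expert.
    scores = [sum(w * x for w, x in zip(row, context)) for row in router_weights]
    return softmax(scores)


def fuse(expert_evidence, weights):
    """Assumed fusion: weighted sum of per-expert class evidence."""
    num_classes = len(expert_evidence[0])
    return [
        sum(w * ev[c] for w, ev in zip(weights, expert_evidence))
        for c in range(num_classes)
    ]


# Two views with two features each (hypothetical values).
views = [[0.2, 0.9], [1.5, -0.3]]
# Three experts: private expert for view 1, private expert for view 2,
# and one collaborative expert. Router rows are hypothetical.
router_weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
expert_evidence = [[2.0, 0.5], [6.0, 1.0], [3.0, 3.0]]

weights = route(views, router_weights)
fused = fuse(expert_evidence, weights)
print(weights, fused)
```

The point of the sketch is structural: the weights depend on every view jointly, so no expert grades its own reliability.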

If this is right

  • Global routing recovers sample-level reliability where branch-local arbitration cannot.
  • Load-balancing and diversity regularization promote balanced expert utilization and specialization.
  • Independent evidential supervision fails to identify any common cross-view evidence scale.
  • Decoupling evidence extraction from arbitration improves handling of sample-dependent reliability.
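The two regularizers named above can be sketched as follows. The page does not give the paper's formulas, so this uses stand-ins that are assumptions: a squared deviation of mean expert usage from uniform for soft load-balancing, and mean pairwise cosine similarity between expert outputs as a diversity penalty. Function names and inputs are hypothetical.

```python
import math


def load_balance_penalty(batch_weights):
    """Squared deviation of mean expert usage from uniform usage.

    batch_weights holds one list of router weights per sample.
    """
    num_samples = len(batch_weights)
    num_experts = len(batch_weights[0])
    usage = [
        sum(sample[e] for sample in batch_weights) / num_samples
        for e in range(num_experts)
    ]
    return sum((u - 1.0 / num_experts) ** 2 for u in usage)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def diversity_penalty(expert_outputs):
    """Mean pairwise cosine similarity; lower means more diverse experts."""
    n = len(expert_outputs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(cosine(expert_outputs[i], expert_outputs[j]) for i, j in pairs)
    return total / len(pairs)


balanced = [[0.5, 0.5], [0.5, 0.5]]    # both experts used equally
collapsed = [[1.0, 0.0], [1.0, 0.0]]   # router always picks expert 0

print(load_balance_penalty(balanced), load_balance_penalty(collapsed))
print(diversity_penalty([[1.0, 0.0], [0.0, 1.0]]))  # orthogonal outputs
print(diversity_penalty([[1.0, 2.0], [1.0, 2.0]]))  # identical outputs
```

Under these stand-ins, a collapsed router and redundant experts both incur a higher penalty, which matches the stated goals of balanced utilization and specialization.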

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same global-context routing idea could be tested on other multi-modal fusion tasks where reliability varies by input without needing explicit reliability labels.
  • An alternative might be to add an explicit cross-view consistency loss on evidence magnitude during training and measure whether it matches the router's gains.
  • The approach suggests viewing reliability arbitration as a context-dependent routing problem rather than a fixed per-view property.

Load-bearing premise

A learned unified router observing global context, together with load-balancing and diversity regularization, will recover sample-level reliability without introducing new fitting artifacts or requiring extra labeled reliability data.

What would settle it

On a multi-view benchmark where one view's evidence is artificially multiplied by a fixed scale factor, compare whether TMUR down-weights that view on unreliable samples while standard independent fusion does not.
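The baseline half of this test can be sketched on a single sample. The sketch assumes the baseline fuses two opinions with the reduced Dempster combination rule common in TMC-style methods, and the standard evidence-to-opinion mapping; the evidence values and the scale factor are hypothetical, and the TMUR side is not reproduced here.

```python
def opinion(evidence):
    """Assumed mapping: b_k = e_k / S, u = K / S, with S = sum(e) + K."""
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    return [e / strength for e in evidence], num_classes / strength


def combine(op_a, op_b):
    """Reduced Dempster rule for two (beliefs, uncertainty) opinions."""
    (b1, u1), (b2, u2) = op_a, op_b
    k = len(b1)
    # Conflict: belief mass the two views assign to different classes.
    conflict = sum(b1[i] * b2[j] for i in range(k) for j in range(k) if i != j)
    norm = 1.0 - conflict
    beliefs = [(b1[i] * b2[i] + b1[i] * u2 + b2[i] * u1) / norm for i in range(k)]
    return beliefs, u1 * u2 / norm


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


view_a = [6.0, 1.0]   # hypothetical reliable view, supports class 0
view_b = [0.5, 4.0]   # hypothetical unreliable view, supports class 1
scale = 10.0          # fixed factor injected into view B's evidence

fused_before, u_before = combine(opinion(view_a), opinion(view_b))
fused_after, u_after = combine(
    opinion(view_a), opinion([scale * e for e in view_b])
)

print(argmax(fused_before), argmax(fused_after))  # fused decision flips
print(round(u_before, 3), round(u_after, 3))      # fused uncertainty shrinks
```

Multiplying view B's evidence changes nothing about how informative that view is for the sample, yet it flips the fused decision and lowers the fused uncertainty. A method that recovers sample-level reliability would be expected to resist this perturbation.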

Figures

Figures reproduced from arXiv: 2604.09288 by Cai Xu, Haishun Chen, Wei Zhao, Yilin Zhang, Ziyu Guan.

Figure 1: (a) Conventional trusted multi-view fusion uses branch-local self-assessed uncertainty for weighting. (b) However, ...
Figure 2: Overview of TMUR. Aligned per-view features are sent to view-private experts for view-specific evidence extraction, ...
Figure 3: Uncertainty distributions of TMUR under increas...
Figure 4: Hyperparameter sensitivity on six datasets.
Figure 5: Joint 𝛽–𝛾 sensitivity surfaces of TMUR on the remaining eight datasets not shown in the main paper. Each subplot shows the five-seed accuracy under the corresponding two-dimensional hyperparameter sweep.
Figure 6: Motivation analysis using saved calibration results of TMC and RCML on four representative datasets. Within each ...
Original abstract

Trusted multi-view classification typically relies on a view-wise evidential fusion process: each view independently produces class evidence and uncertainty, and the final prediction is obtained by aggregating these independent opinions. While this design is modular and uncertainty-aware, it implicitly assumes that evidence from different views is numerically comparable. In practice, however, this assumption is fragile. Different views often differ in feature space, noise level, and semantic granularity, while independently trained branches are optimized only for prediction correctness, without any constraint enforcing cross-view consistency in evidence strength. As a result, the uncertainty used for fusion can be dominated by branch-specific scale bias rather than true sample-level reliability. To address this issue, we propose Trusted Multi-view learning with Unified Routing (TMUR), which decouples view-specific evidence extraction from fusion arbitration. TMUR uses view-private experts and one collaborative expert, and employs a unified router that observes the global multi-view context to generate sample-level expert weights. Soft load-balancing and diversity regularization further encourage balanced expert utilization and more discriminative expert specialization. We also provide theoretical analysis showing why independent evidential supervision does not identify a common cross-view evidence scale, and why unified global routing is preferable to branch-local arbitration when reliability is sample-dependent.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that independently estimated uncertainties in trusted multi-view classification are not numerically comparable across views. Independent per-view evidential training optimizes only for classification accuracy and imposes no cross-view scale consistency, allowing branch-specific biases to dominate fusion. The authors propose TMUR, which decouples evidence extraction (via view-private experts plus one collaborative expert) from fusion arbitration via a unified router that observes global multi-view context to produce sample-level weights; soft load-balancing and diversity regularization are added to encourage balanced utilization and specialization. A theoretical analysis is provided to show why local evidential supervision cannot identify a common evidence scale and why global routing is preferable when reliability is sample-dependent.

Significance. If the central claim holds, the work is significant for trusted multi-view learning: it directly challenges the implicit comparability assumption underlying evidential fusion methods and supplies both a theoretical grounding and a practical router-based architecture. The explicit theoretical analysis of non-identifiability under independent supervision and the use of global context plus regularization are strengths that could improve reliability in safety-critical multi-view applications.

major comments (3)
  1. [§4] §4 (theoretical analysis): The argument that independent evidential supervision fails to identify a common cross-view evidence scale is load-bearing for the motivation. A concrete derivation or simple counter-example (e.g., two views with different evidence scales that yield identical per-view losses but inconsistent fused uncertainty) should be supplied to make the non-identifiability explicit rather than asserted.
  2. [§5] §5 (experiments): The claim that the unified router recovers sample-level reliability (rather than merely compensating for training-distribution biases) is central yet rests on downstream classification accuracy alone. Additional experiments are needed that inject controlled scale biases, evaluate on out-of-distribution samples, or compare against oracles with explicit reliability labels to demonstrate that the router generalizes beyond fitting artifacts.
  3. [§3.2] §3.2 (router and regularization): The load-balancing and diversity coefficients are free parameters whose values affect whether the router truly reflects per-sample reliability or introduces new spurious correlations. Sensitivity analysis or ablation on these coefficients should be reported to confirm robustness of the reported gains.
minor comments (2)
  1. Notation for expert weights and uncertainty outputs should be unified across the method and theory sections to avoid ambiguity when comparing local vs. global arbitration.
  2. Figure captions for the architecture diagram should explicitly label the global context input to the router and the flow of evidence vs. weights.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments, which highlight important aspects for strengthening the theoretical motivation, experimental validation, and robustness analysis. We address each major comment point by point below and will incorporate the suggested revisions in the updated manuscript.

Point-by-point responses
  1. Referee: [§4] §4 (theoretical analysis): The argument that independent evidential supervision fails to identify a common cross-view evidence scale is load-bearing for the motivation. A concrete derivation or simple counter-example (e.g., two views with different evidence scales that yield identical per-view losses but inconsistent fused uncertainty) should be supplied to make the non-identifiability explicit rather than asserted.

    Authors: We agree that a concrete counter-example would make the non-identifiability of the common evidence scale more explicit and strengthen the motivation. In the revised Section 4, we will add a simple derivation and counter-example involving two views with differing evidence scales. The example will show that these views can achieve identical per-view evidential losses under independent supervision yet produce inconsistent fused uncertainties, illustrating why local supervision cannot enforce cross-view scale consistency.
    revision: yes

  2. Referee: [§5] §5 (experiments): The claim that the unified router recovers sample-level reliability (rather than merely compensating for training-distribution biases) is central yet rests on downstream classification accuracy alone. Additional experiments are needed that inject controlled scale biases, evaluate on out-of-distribution samples, or compare against oracles with explicit reliability labels to demonstrate that the router generalizes beyond fitting artifacts.

    Authors: The existing experiments demonstrate consistent gains in accuracy and uncertainty calibration across standard multi-view benchmarks, supporting the router's utility. To more directly validate that the router captures sample-level reliability rather than merely compensating for training biases, we will add controlled experiments in the revision: injecting synthetic scale biases into view uncertainties to test correction, and evaluating on out-of-distribution samples. While a full oracle comparison using explicit reliability labels would require new annotated datasets beyond the current scope, the proposed additions will provide stronger evidence of generalization.
    revision: partial

  3. Referee: [§3.2] §3.2 (router and regularization): The load-balancing and diversity coefficients are free parameters whose values affect whether the router truly reflects per-sample reliability or introduces new spurious correlations. Sensitivity analysis or ablation on these coefficients should be reported to confirm robustness of the reported gains.

    Authors: We agree that sensitivity analysis on the load-balancing and diversity coefficients is important to confirm that the reported improvements are robust and not artifacts of specific hyperparameter choices. In the revised manuscript, we will include an ablation study in Section 5 that varies these coefficients over a range of values and reports the resulting performance to demonstrate stability of the gains.
    revision: yes

Circularity Check

0 steps flagged

No significant circularity; theoretical analysis and router design remain independent of fitted outputs

full rationale

The paper's core derivation consists of a theoretical argument showing that independent evidential supervision cannot enforce a common cross-view evidence scale, followed by a proposed unified router trained end-to-end on classification loss. This theory is presented as first-principles reasoning about optimization constraints and does not reduce to the router's learned weights or any fitted parameter by construction. The router itself is a trainable module whose weights are optimized for downstream accuracy rather than being renamed from a prior fit; empirical validation on benchmarks provides external grounding. No self-citations, self-definitional equations, or 'predictions' that collapse to inputs appear in the abstract or described sections. The design choices (load-balancing, diversity regularization) are motivated by the theory but do not tautologically presuppose the method's success.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 3 invented entities

Central claim rests on the domain assumption that view-specific training produces incomparable evidence scales and on the architectural postulate that a learned global router plus regularization will recover sample-dependent reliability.

free parameters (2)
  • router output weights
    Sample-level expert weights produced by the unified router and optimized during training.
  • load-balancing and diversity coefficients
    Hyperparameters controlling soft load-balancing and diversity regularization terms.
axioms (2)
  • domain assumption: Different views differ in feature space, noise level, and semantic granularity, so independently optimized evidence lacks a common numerical scale.
    Explicitly stated as the core fragility of current evidential fusion.
  • domain assumption: Reliability is sample-dependent rather than view-dependent.
    Used to argue that branch-local arbitration is insufficient.
invented entities (3)
  • view-private experts (no independent evidence)
    purpose: Extract view-specific class evidence and uncertainty
    New modular component per view.
  • collaborative expert (no independent evidence)
    purpose: Provide shared evidence across views
    Introduced to complement private experts.
  • unified router (no independent evidence)
    purpose: Generate global sample-level expert weights from multi-view context
    Central new mechanism for arbitration.

pith-pipeline@v0.9.0 · 5528 in / 1389 out tokens · 94335 ms · 2026-05-10T17:30:37.084047+00:00 · methodology
