IMA-MoE: An Interpretable Modality-Aware Mixture-of-Experts Framework for Characterizing the Neurobiological Signatures of Binge Eating Disorder
Pith reviewed 2026-05-10 07:06 UTC · model grok-4.3
The pith
A modality-aware mixture-of-experts model encodes multimodal data as tokens to better distinguish binge eating disorder and identify sex-specific biological patterns.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IMA-MoE encodes each heterogeneous measure as a distinct token in a mixture-of-experts architecture that models cross-modal dependencies while preserving modality-specific traits, and it uses a token-importance mechanism to quantify each measure's contribution. On the ABCD dataset, the authors report that this differentiates BED from healthy controls better than baseline methods and uncovers sex-specific patterns, with hormonal measures carrying more weight in predictions for females.
What carries the argument
The token-importance mechanism applied to modality-encoded tokens within the mixture-of-experts architecture, which quantifies the predictive contribution of each individual measure.
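To make the idea concrete, the following is a minimal sketch of one way a token-importance score could be computed, assuming each measure already has an embedding. The paper's actual mechanism is not described here, so the linear scoring vector `score_weights`, the softmax normalization, the measure names, and all numeric values are illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def token_importance(tokens, score_weights):
    """Score each token with a linear probe, then normalize so weights sum to 1.

    tokens: dict mapping measure name -> embedding (list of floats)
    score_weights: hypothetical learned scoring vector, one weight per dimension
    """
    names = list(tokens)
    raw = [sum(w * x for w, x in zip(score_weights, tokens[n])) for n in names]
    return dict(zip(names, softmax(raw)))

def pooled_representation(tokens, importance):
    """Importance-weighted sum of token embeddings (input to a classifier head)."""
    dim = len(next(iter(tokens.values())))
    return [sum(importance[n] * tokens[n][d] for n in tokens) for d in range(dim)]

# Toy embeddings for three hypothetical measures (values are made up).
tokens = {
    "estradiol": [0.9, 0.1],
    "bmi": [0.2, 0.4],
    "fmri_roi_1": [0.1, 0.3],
}
importance = token_importance(tokens, score_weights=[2.0, -1.0])
pooled = pooled_representation(tokens, importance)
```

Because the weights are normalized across tokens, each one can be read as a relative contribution for a single subject. Whether such scores reflect biology rather than model artifacts is exactly the premise questioned further down.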
If this is right
- Superior performance compared to baseline methods in classifying BED versus healthy controls.
- Revelation of sex-specific predictive patterns in the data.
- Hormonal measures play a more prominent role in predictions for females.
- Support for data-driven multimodal approaches to characterize neurobiological signatures of BED.
- Potential to enable more precise and personalized interventions for neuropsychiatric disorders.
Where Pith is reading between the lines
- Similar token-based modeling could improve understanding of other eating disorders or psychiatric conditions with complex multimodal data.
- The identified sex differences may point to tailored screening or treatment strategies based on biological sex.
- Future work could test whether these token importances predict treatment response or longitudinal outcomes.
- Applying the framework to other datasets might confirm if the patterns generalize beyond the ABCD cohort.
Load-bearing premise
That representing each data measure as an independent token and scoring its importance will capture genuine biological mechanisms rather than patterns unique to this dataset or introduced by the model.
What would settle it
Applying the same IMA-MoE model to an independent cohort of adolescents and observing no improvement over baselines or no replication of the sex-specific hormonal importance.
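A replication check of this kind can be sketched as a comparison of ROC AUC between the model and a baseline on held-out labels from the independent cohort. The function names and the toy labels and scores below are hypothetical; a real analysis would also need confidence intervals (for example, from bootstrapping) before concluding anything.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def replication_gap(labels, model_scores, baseline_scores):
    """Positive values mean the model beats the baseline on this cohort."""
    return roc_auc(labels, model_scores) - roc_auc(labels, baseline_scores)

# Made-up held-out data: 1 = BED, 0 = healthy control.
labels = [1, 1, 0, 0, 1, 0]
model_scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.5]
baseline_scores = [0.6, 0.4, 0.5, 0.3, 0.7, 0.2]
gap = replication_gap(labels, model_scores, baseline_scores)
```

A gap near zero or below zero on the new cohort, or hormonal importances that no longer differ by sex, would count against the claim.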
Original abstract
Binge eating disorder (BED) is the most prevalent eating disorder. However, current diagnostic frameworks remain largely grounded in symptom-based criteria rather than underlying biological mechanisms, thereby limiting early detection and the development of biologically-informed interventions. Emerging studies have begun to investigate the neurobiological signatures of BED, yet their findings are often difficult to generalize due to the reliance on hypothesis-driven parametric models, single-modality analyses, and limited data diversity. Therefore, there is a critical need for advanced data-driven frameworks capable of modeling multimodal data to uncover generalizable and biologically meaningful signatures of BED. In this study, we propose the Interpretable Modality-Aware Mixture-of-Experts (IMA-MoE), a novel architecture designed to integrate heterogeneous neuroimaging, behavioral, hormonal, and demographic measures within a unified predictive framework. By encoding each measure as a distinct token, IMA-MoE enables flexible modeling of cross-modal dependencies while preserving modality-specific characteristics. We further introduce a token-importance mechanism to enhance interpretability by quantifying the contribution of each measure to model predictions. Evaluated on the large-scale Adolescent Brain Cognitive Development (ABCD) dataset, IMA-MoE demonstrates superior performance in differentiating BED from healthy controls compared with baseline methods, while revealing sex-specific predictive patterns, with hormonal measures contributing more prominently to prediction in females. Collectively, these findings highlight the promise of interpretable, data-driven multimodal modeling in advancing biologically-informed characterization of BED and facilitating more precise and personalized interventions in neuropsychiatric disorders.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes the Interpretable Modality-Aware Mixture-of-Experts (IMA-MoE) architecture for multimodal integration of neuroimaging, behavioral, hormonal, and demographic measures in the ABCD dataset. Each measure is encoded as a distinct token to model cross-modal dependencies while preserving modality-specific traits; a token-importance mechanism is introduced to quantify contributions to BED vs. healthy control predictions. The central claims are superior predictive performance over baselines and the discovery of sex-specific patterns, with hormonal measures contributing more prominently in females.
Significance. If the performance gains and interpretability results hold after rigorous validation, the work could advance data-driven multimodal modeling for neuropsychiatric disorders by moving beyond single-modality or hypothesis-driven approaches. The large-scale ABCD evaluation and explicit focus on interpretability via token importance are strengths that could support biologically-informed characterization of BED if the importance scores prove stable and aligned with external neurobiological evidence.
Major comments (2)
- [Abstract] The abstract asserts superior performance in differentiating BED from controls and sex-specific predictive patterns but provides no quantitative metrics (e.g., accuracy, AUC, F1), error bars, statistical tests, baseline method details, data-split procedures, or missing-modality handling. This absence makes it impossible to evaluate whether the stated claims are supported by the results.
- [Model architecture and results] Token-importance mechanism (described in the model architecture and results sections): The claim that this mechanism reveals biologically meaningful neurobiological signatures, including sex-specific hormonal contributions, is load-bearing for the paper's interpretive conclusions. However, the manuscript does not report stability of importance scores across cross-validation folds, permutation-baseline comparisons, or direct alignment with independent BED literature, leaving open the possibility that scores reflect dataset artifacts, demographic confounds, or MoE routing biases rather than stable biological signals.
Minor comments (2)
- [Methods] Notation for token encoding and expert routing should be clarified with explicit equations or pseudocode to allow reproduction of the modality-aware fusion step.
- [Figures] Figure captions for importance visualizations should include the exact statistical procedure used to derive and threshold the reported contributions.
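For orientation, the kind of explicit notation requested in the first minor comment might resemble the generic sparse mixture-of-experts formulation below. This is a sketch in hypothetical notation, not the manuscript's equations: the symbols $x_m$, $\mu(m)$, $W_r$, $E_i$, and the top-$k$ index set $\mathcal{T}_k$ are all assumptions.

```latex
% Hypothetical notation; not taken from the manuscript.
% x_m: raw value(s) of measure m;  \mu(m): modality of measure m
e_m = W_{\mu(m)} x_m + b_{\mu(m)}
  \qquad \text{(modality-specific token encoder)}

% Router over N experts; \mathcal{T}_k(s_m) = indices of the k largest entries of s_m
s_m = W_r e_m \in \mathbb{R}^{N}, \qquad
g_{m,i} =
\begin{cases}
  \dfrac{\exp(s_{m,i})}{\sum_{j \in \mathcal{T}_k(s_m)} \exp(s_{m,j})} & i \in \mathcal{T}_k(s_m) \\[2ex]
  0 & \text{otherwise}
\end{cases}

% Expert mixture output for token m
h_m = \sum_{i=1}^{N} g_{m,i} \, E_i(e_m)
```

Stating the encoder, router, and mixture output separately makes clear where modality awareness enters (here, through the per-modality encoder parameters) and which quantities a reproduction would need.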
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point-by-point below and commit to revisions that will strengthen the clarity and rigor of the manuscript without altering its core contributions.
- Referee: [Abstract] The abstract asserts superior performance in differentiating BED from controls and sex-specific predictive patterns but provides no quantitative metrics (e.g., accuracy, AUC, F1), error bars, statistical tests, baseline method details, data-split procedures, or missing-modality handling. This absence makes it impossible to evaluate whether the stated claims are supported by the results.
  Authors: We agree that the abstract would be strengthened by including key quantitative metrics. Although the full results (including AUC, accuracy, F1, statistical comparisons, cross-validation splits, and missing-modality handling via the token-based architecture) are reported in the Results and Methods sections, we will revise the abstract to concisely incorporate representative performance numbers, baseline details, and a brief note on data handling. This revision will make the abstract self-contained while respecting length limits.
  Revision: yes
- Referee: [Model architecture and results] Token-importance mechanism (described in the model architecture and results sections): The claim that this mechanism reveals biologically meaningful neurobiological signatures, including sex-specific hormonal contributions, is load-bearing for the paper's interpretive conclusions. However, the manuscript does not report stability of importance scores across cross-validation folds, permutation-baseline comparisons, or direct alignment with independent BED literature, leaving open the possibility that scores reflect dataset artifacts, demographic confounds, or MoE routing biases rather than stable biological signals.
  Authors: We appreciate this important point on validating the interpretability claims. The manuscript introduces the token-importance mechanism and applies it to identify sex-specific patterns (e.g., greater hormonal contribution in females) on the ABCD data. To address potential concerns about stability and artifacts, the revised manuscript will add: (1) stability metrics (mean and standard deviation of importance scores across CV folds), (2) permutation-based baselines to compare against random routing, and (3) explicit discussion aligning the observed patterns with independent BED literature on sex differences and hormonal factors. These additions will provide stronger evidence that the scores capture biologically relevant signals.
  Revision: yes
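The stability and permutation checks discussed in this exchange could be sketched roughly as follows. Two caveats apply. First, the data structures, measure names, and numbers are invented for illustration. Second, the second function is a plain label-permutation test of the female-versus-male gap in one measure's importance, which is a swapped-in stand-in and not necessarily the "random routing" baseline the authors have in mind.

```python
import random
import statistics

def fold_stability(importance_by_fold):
    """Mean and sample standard deviation of each measure's importance across folds."""
    measures = list(importance_by_fold[0])
    return {
        m: (
            statistics.mean([fold[m] for fold in importance_by_fold]),
            statistics.stdev([fold[m] for fold in importance_by_fold]),
        )
        for m in measures
    }

def permutation_p_value(values, groups, n_permutations=2000, seed=0):
    """One-sided test of mean(values | 'F') - mean(values | 'M') under label shuffling."""
    def gap(labels):
        female = [v for v, g in zip(values, labels) if g == "F"]
        male = [v for v, g in zip(values, labels) if g == "M"]
        return statistics.mean(female) - statistics.mean(male)

    observed = gap(groups)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(groups)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # keeps group sizes, breaks the value-label link
        if gap(shuffled) >= observed:
            hits += 1
    # Add-one correction avoids reporting a p-value of exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)

# Made-up per-fold importances for two hypothetical measures.
folds = [
    {"hormone": 0.30, "bmi": 0.20},
    {"hormone": 0.34, "bmi": 0.18},
    {"hormone": 0.32, "bmi": 0.22},
]
stability = fold_stability(folds)

# Made-up per-subject importance of a hormonal measure, with sex labels.
hormone_importance = [0.40, 0.38, 0.42, 0.41, 0.20, 0.22, 0.19, 0.21]
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]
observed_gap, p_value = permutation_p_value(hormone_importance, sex)
```

A small standard deviation relative to the mean across folds suggests a ranking that does not hinge on one data split, while the permutation p-value indicates whether a sex difference of the observed size would be common if sex labels were unrelated to importance.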
Circularity Check
No circularity: empirical architecture evaluated on external data
The paper proposes a new neural architecture (IMA-MoE) that encodes measures as tokens and adds a token-importance mechanism, then reports empirical results on the independent ABCD dataset for BED vs. control classification. No equations, derivations, or first-principles claims are presented that reduce by construction to fitted parameters, self-citations, or renamed inputs. Performance superiority and sex-specific patterns are asserted from experimental comparisons rather than tautological definitions, leaving the derivation chain self-contained with no load-bearing circular steps.