Human-computer interactions predict mental health

David Whitney; Jefferson Ortega; Veith Weilnhammer

arXiv: 2511.20179 · v5 · submitted 2025-11-25 · 🧬 q-bio.NC · cs.AI · cs.HC

Veith Weilnhammer, Jefferson Ortega, David Whitney

Pith reviewed 2026-05-17 05:04 UTC · model grok-4.3

classification 🧬 q-bio.NC · cs.AI · cs.HC

keywords mental health · machine learning · digital phenotyping · human-computer interaction · cursor tracking · touchscreen data · self-report · mental state inference

The pith

Everyday cursor and touchscreen interactions encode mental health states that machine learning can extract accurately.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that routine human-computer interactions carry detailed signals about mental health. The authors built MAILA to analyze cursor and touchscreen data from thousands of users and map it to 13 clinical dimensions of mental state. This approach detects daily rhythms, responses to mood changes, and information beyond what people report verbally. If correct, it offers a passive, scalable way to assess mental health using data generated during normal device use rather than relying solely on clinical visits or questionnaires.

Core claim

MAILA is a machine-learning framework trained on 18,200 cursor and touchscreen recordings paired with 1.3 million mental-health self-reports from 9,500 participants. It tracks dynamic mental states along 13 clinically relevant dimensions, resolves circadian fluctuations and experimental manipulations of arousal and valence, achieves near-ceiling accuracy at the group level, captures information only partially reflected in verbal self-report, and improves the ability of large language models to infer user mental health.

What carries the argument

MAILA, the MAchine-learning framework for Inferring Latent mental states from digital Activity, which extracts psychological signatures from patterns in cursor movements and touchscreen interactions.

If this is right

Mental health can be assessed continuously and scalably through normal device use.
Daily fluctuations in mental states become trackable without additional user effort.
Large language models gain better performance when inferring mental health from user context.
Digital phenotyping gains human-computer interaction as a new, untapped data modality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Apps could integrate similar tracking to provide real-time mental health insights during routine use.
The method may reduce reliance on self-reports alone in both research and clinical settings.
Generalization tests across devices, cultures, and clinical populations would clarify real-world limits.

Load-bearing premise

Self-reported mental health labels serve as accurate ground truth, and the interaction data contain no major unmeasured confounds from device type, task demands, demographics, or reporting biases.

What would settle it

An experiment in which participants undergo controlled manipulations of arousal or valence and the model fails to detect corresponding changes from their cursor or touchscreen data alone.
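
One way such a test could be scored is sketched below. It is a minimal illustration, not the authors' analysis: it assumes the model outputs one predicted arousal value per participant before and after a manipulation, and it uses a sign-flip permutation test on the paired differences. The function name `sign_flip_p_value` and all numbers are hypothetical.

```python
import random
from statistics import mean


def sign_flip_p_value(before, after, n_permutations=5000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis of no change, each paired difference is
    equally likely to be positive or negative, so we randomly flip signs
    and count how often the mean shift is at least as large as observed.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    observed = abs(mean(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_permutations):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical model-predicted arousal for eight participants,
# inferred from interaction data alone (made-up numbers).
predicted_before = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20, 0.12, 0.18]
predicted_after = [0.45, 0.50, 0.35, 0.62, 0.40, 0.58, 0.41, 0.49]

p_shift = sign_flip_p_value(predicted_before, predicted_after)
p_null = sign_flip_p_value(predicted_before, predicted_before)
```

A large p-value in a well-powered experiment of this kind would count against the claim; a small one would only show that the predictions moved, not that they moved for the right reason.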

Figures

Figures reproduced from arXiv: 2511.20179 by David Whitney, Jefferson Ortega, Veith Weilnhammer.

**Figure 2.** Human-computer interactions predict mental health. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png]

**Figure 3.** MAILA predicts changes in mental health. [PITH_FULL_IMAGE:figures/full_fig_p007_3.png]

**Figure 4.** MAILA generalizes to clinical populations. [PITH_FULL_IMAGE:figures/full_fig_p010_4.png]

**Figure 5.** Human-computer interactions predict three orthogonal dimensions of mental [PITH_FULL_IMAGE:figures/full_fig_p013_5.png]

**Figure 6.** Human-computer interactions track group-level mental health. [PITH_FULL_IMAGE:figures/full_fig_p014_6.png]

**Figure 7.** MAILA as a bridge between language and cognitive markers of mental [PITH_FULL_IMAGE:figures/full_fig_p016_7.png]

Original abstract

Scalable assessments of mental illness remain a critical roadblock toward accessible and equitable care. Here, we show that everyday human-computer interactions encode mental health with biomarker accuracy. We introduce MAILA, a MAchine-learning framework for Inferring Latent mental states from digital Activity. We trained MAILA on 18,200 cursor and touchscreen recordings labeled with 1.3 million mental-health self-reports collected from 9,500 participants. MAILA tracks dynamic mental states along 13 clinically relevant dimensions, resolves circadian fluctuations and experimental manipulations of arousal and valence, achieves near-ceiling accuracy at the group level, captures information that is only partially reflected in verbal self-report, and improves the ability of large language models to infer user mental health. By extracting signatures of psychological function that have so far remained untapped, MAILA establishes human-computer interactions as a new modality for scalable digital phenotyping and a foundation for context-aware artificial intelligence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

The paper shows that cursor and touchscreen patterns can be linked to 13 mental health dimensions at scale, but the claims rest on self-report labels, with too few demonstrated controls or external checks.

The main takeaway is that MAILA uses a large set of everyday device interactions to infer mental states across 13 dimensions, with reported ability to track circadian shifts and lab-induced changes in arousal and valence while also boosting LLM inferences. The dataset size stands out: 9,500 participants and 1.3 million self-reports tied to 18,200 recordings give it real weight for exploring passive signals in digital phenotyping. If the features genuinely separate psychological information from noise, this could support low-effort monitoring ideas that go beyond what people type or say out loud. That part is worth noting as a practical step forward from smaller prior studies in the area.

The soft spots are the dependence on self-reports as the sole ground truth and the missing details on how confounds were handled. Patterns in how people move a cursor or tap a screen could easily track reporting style, device type, task context, or demographics instead of independent mental states. The abstract asserts the model captures extra information not in verbal reports, yet without tests against clinical interviews, physiological markers, or clear validation splits and error bars, that separation stays unproven. The experimental manipulations are mentioned but not shown to rule out the same issues in the big observational data.

This is for researchers working on scalable mental health tools and human-computer interaction signals. Readers building digital phenotyping systems might pick up useful framing or scale benchmarks, though they would likely need to add their own controls and external validators. It deserves a serious referee because the dataset and multi-dimensional framing are substantial enough to test properly rather than dismiss outright. I would send it for review with a focus on validation methods and alternative ground truths.

Referee Report

3 major / 2 minor

Summary. The paper claims that everyday human-computer interactions encode mental health with biomarker accuracy. It introduces MAILA, a machine-learning framework trained on 18,200 cursor and touchscreen recordings labeled with 1.3 million mental-health self-reports from 9,500 participants. MAILA is said to track dynamic mental states along 13 clinically relevant dimensions, resolve circadian fluctuations and experimental manipulations of arousal and valence, achieve near-ceiling accuracy at the group level, capture information only partially reflected in verbal self-report, and improve large language models' ability to infer user mental health.

Significance. If the central claims hold after addressing validation concerns, this work could significantly advance scalable digital phenotyping by establishing human-computer interactions as a new, passive modality for mental health assessment. The large-scale dataset and multi-dimensional approach represent a strength, potentially leading to more accessible care and context-aware AI systems.

major comments (3)

[Abstract] The assertion that MAILA 'captures information that is only partially reflected in verbal self-report' is load-bearing for the novelty claim but is not supported by any described comparison to independent non-self-report criteria such as clinical interview scores or physiological markers.
[Methods] The manuscript provides insufficient detail on validation splits, error bars, cross-validation strategy, and explicit controls for confounds including device type, task demands, demographics, and reporting biases, which are critical to evaluate whether the near-ceiling group-level accuracy reflects latent mental states rather than spurious correlations with self-report style.
[Results] The claim that MAILA resolves experimental manipulations of arousal and valence does not include reporting of how the large observational dataset controls for the same confounds, leaving the group-level accuracy vulnerable to alternative explanations.

minor comments (2)

[Abstract] The acronym expansion for MAILA could be formatted more clearly on first use to improve immediate readability.
Tables or figures presenting the 13 dimensions would benefit from explicit labeling of each dimension and associated performance metrics for clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for their detailed and constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below. Where appropriate, we have revised the manuscript to address the concerns raised.

Referee: [Abstract] The assertion that MAILA 'captures information that is only partially reflected in verbal self-report' is load-bearing for the novelty claim but is not supported by any described comparison to independent non-self-report criteria such as clinical interview scores or physiological markers.

Authors: We thank the referee for this important observation. The statement in the abstract is based on supplementary analyses in the manuscript demonstrating that MAILA-derived predictions from human-computer interaction data account for unique variance in mental health self-reports not explained by other self-report measures alone. However, we acknowledge that this does not constitute validation against independent criteria such as clinical interviews or physiological markers, which are not available in our dataset. In the revised version, we will qualify this claim in the abstract and add a dedicated limitations section discussing the reliance on self-report labels and the need for future validation with clinical data. This revision will be incorporated.

revision: yes

Referee: [Methods] The manuscript provides insufficient detail on validation splits, error bars, cross-validation strategy, and explicit controls for confounds including device type, task demands, demographics, and reporting biases, which are critical to evaluate whether the near-ceiling group-level accuracy reflects latent mental states rather than spurious correlations with self-report style.

Authors: We apologize for any lack of clarity in the methods description. The original manuscript includes participant-wise cross-validation to prevent data leakage, with error bars representing standard errors across cross-validation folds. To address the referee's concern, we will expand the Methods section with additional details on the validation strategy, including the exact split ratios and how confounds were controlled. Specifically, we will report subgroup analyses by device type and demographics, and include statistical controls for task demands and potential reporting biases where applicable. These enhancements will improve transparency and allow readers to better assess the robustness of our findings.

revision: yes

Referee: [Results] The claim that MAILA resolves experimental manipulations of arousal and valence does not include reporting of how the large observational dataset controls for the same confounds, leaving the group-level accuracy vulnerable to alternative explanations.

Authors: Thank you for pointing this out. The experimental manipulations of arousal and valence were conducted in a controlled experimental arm of the study, separate from the large observational dataset, and we report the results after accounting for time-of-day and other basic variables. For the observational data, we have applied controls for circadian effects and basic demographics. We agree that more explicit documentation is warranted. In the revision, we will add a paragraph in the Results section detailing the confound control procedures applied to the observational dataset and include any relevant sensitivity analyses. This will help rule out alternative explanations and strengthen the interpretation.

revision: yes
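
The participant-wise cross-validation mentioned in the second response can be sketched as follows. This is an illustrative sketch only, assuming each recording carries a participant identifier; the helper names `participant_folds` and `standard_error`, the round-robin fold assignment, and the example scores are hypothetical and are not taken from the manuscript.

```python
import math
from statistics import stdev


def participant_folds(participant_ids, n_folds):
    """Yield (train_idx, test_idx) pairs where no participant spans both sets.

    Splitting by participant rather than by recording prevents a model from
    being tested on a person whose other recordings it saw during training.
    """
    unique_ids = sorted(set(participant_ids))
    if n_folds > len(unique_ids):
        raise ValueError("need at least one participant per fold")
    # Round-robin assignment of participants (not recordings) to folds.
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_ids)}
    for fold in range(n_folds):
        test_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] == fold]
        train_idx = [i for i, pid in enumerate(participant_ids) if fold_of[pid] != fold]
        yield train_idx, test_idx


def standard_error(fold_scores):
    """Standard error of the mean score across cross-validation folds."""
    return stdev(fold_scores) / math.sqrt(len(fold_scores))


# Hypothetical data: six participants with two recordings each.
recording_owner = ["p1", "p1", "p2", "p2", "p3", "p3",
                   "p4", "p4", "p5", "p5", "p6", "p6"]
folds = list(participant_folds(recording_owner, n_folds=3))

# Made-up per-fold scores, not results from the paper.
example_fold_scores = [0.61, 0.58, 0.64]
example_error_bar = standard_error(example_fold_scores)
```

Scores from different folds share training data, so a standard error computed this way is a rough error bar rather than a strict confidence statement.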

Circularity Check

0 steps flagged

No circularity: empirical supervised learning on held-out self-report labels

The paper trains MAILA on cursor/touchscreen features to predict held-out self-reported mental health labels collected from participants. This is standard supervised machine learning with performance measured on data splits independent of the training set; the reported accuracy is not equivalent to the input labels by construction, nor does any derivation step reduce to a self-definition, fitted parameter renamed as prediction, or self-citation chain. Claims about capturing information partially independent of verbal self-report and resolving experimental manipulations are presented as empirical outcomes rather than definitional. The derivation chain is therefore self-contained against the external benchmark of the collected labels and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that self-report labels are valid proxies for latent mental states and that interaction patterns generalize beyond the sampled population and devices.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

MAILA uses unsupervised representation learning to encode each participant’s cursor or touchscreen activity as a distribution over stereotyped movement patterns... LSTM autoencoder... K-means clusters... support vector regression

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

We trained MAILA on 18,200 cursor and touchscreen recordings labeled with 1.3 million mental-health self-reports
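
The first quoted passage describes encoding a recording as a distribution over stereotyped movement patterns. A minimal sketch of that one step is given below, under stated assumptions: segment embeddings are taken as given (in the paper they would come from the LSTM autoencoder), cluster centroids are taken as given (in the paper, from K-means), and the downstream support vector regression is omitted. The function names and toy coordinates are hypothetical.

```python
import math


def nearest_centroid(embedding, centroids):
    """Index of the centroid closest to the embedding (Euclidean distance)."""
    distances = [math.dist(embedding, c) for c in centroids]
    return distances.index(min(distances))


def pattern_distribution(segment_embeddings, centroids):
    """Fraction of a recording's segments assigned to each movement pattern.

    The result is a fixed-length feature vector that sums to 1, regardless
    of how long the recording is.
    """
    total = len(segment_embeddings)
    if total == 0:
        raise ValueError("recording has no segments")
    counts = [0] * len(centroids)
    for embedding in segment_embeddings:
        counts[nearest_centroid(embedding, centroids)] += 1
    return [count / total for count in counts]


# Toy 2-D example: three pattern centroids and one recording of four segments.
centroids = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
recording = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (0.2, 0.1)]
features = pattern_distribution(recording, centroids)
```

In this toy case two segments fall nearest the first centroid and two nearest the second, so the recording is summarized as half pattern one and half pattern two. A regressor would then be trained on such vectors to predict self-report scores.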

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: A systematic analysis for the Global Burden of Disease Study 2019

GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: A systematic analysis for the Global Burden of Disease Study 2019 . The Lancet. Psychiatry 9, 137–150 (2022)

work page 2019

[2] [2]

Mental disorders Factsheet

WHO. Mental disorders Factsheet

work page

[3] [3]

Ghio, L. et al. Duration of untreated illness and outcomes in unipolar depression: A sys- tematic review and meta-analysis . Journal of Aﬀective Disorders 152-154, 45–51 (2014)

work page 2014

[4] [4]

Pablo, G. S. de et al. What is the duration of untreated psychosis worldwide? – A meta- analysis of pooled mean and median time and regional trends and other correlates across 369 studies . Psychological Medicine 54, 652–662 (2024)

work page 2024

[5] [5]

Kraus, C. et al. Prognosis and improved outcomes in major depression: A review . Transla- tional Psychiatry 9, 127 (2019)

work page 2019

[6] [6]

Preece, D. A. et al. Alexithymia proﬁles and depression, anxiety, and stress . Journal of Aﬀective Disorders 357, 116–125 (2024)

work page 2024

[7] [7]

Clement, S. et al. What is the impact of mental health-related stigma on help-seeking? A systematic review of quantitative and qualitative studies . Psychological Medicine 45, 11–27 (2015)

work page 2015

[8] [8]

Lost in Translation

Miteva, D. et al. Impact of language proﬁciency on mental health service use, treatment and outcomes: "Lost in Translation" . Comprehensive Psychiatry 114, 152299 (2022)

work page 2022

[9] [9]

Keynejad, R. C. et al. WHO Mental Health Gap Action Programme (mhGAP) Intervention Guide: A systematic review of evidence from low and middle-income countries . Evidence Based Mental Health 21, (2018)

work page 2018

[10] [10]

Binz, M. et al. A foundation model to predict and capture human cognition . Nature 644, 1002–1009 (2025)

work page 2025

[11] [11]

Dohnány, S. et al. Technological folie à deux: Feedback Loops Between AI Chatbots and Mental Illness. (2025) doi: 10.48550/arXiv.2507.19218

work page doi:10.48550/arxiv.2507.19218 2025

[12] [12]

Galatzer-Levy, I. R. et al. Generative Psychometrics—An Emerging Frontier in Mental Health Measurement. JAMA Psychiatry (2025) doi: 10.1001/jamapsychiatry.2025.3258

work page doi:10.1001/jamapsychiatry.2025.3258 2025

[13] [13]

Lewis, C. M. et al. Polygenic risk scores: From research tools to clinical instruments . Genome Medicine 12, 44 (2020)

work page 2020

[14] [14]

Murray, G. K. et al. Could Polygenic Risk Scores Be Useful in Psychiatry?: A Review . JAMA Psychiatry 78, 210–219 (2021)

work page 2021

[15] [15]

Sanchez-Roige, S. et al. Emerging phenotyping strategies will advance our understanding of psychiatric genetics . Nature neuroscience 23, 475–480 (2020)

work page 2020

[16] [16]

Kambeitz, J. et al. Detecting Neuroimaging Biomarkers for Depression: A Meta-analysis of Multivariate Pattern Recognition Studies . Biological Psychiatry 82, 330–338 (2017). 74

[17]

Marek, S. et al. Reproducible brain-wide association studies require thousands of individuals. Nature 603, 654–660 (2022)

[18]

Abd-Alrazaq, A. et al. Systematic review and meta-analysis of performance of wearable artificial intelligence in detecting and predicting depression. npj Digital Medicine 6, 84 (2023)

[19]

Liu, J. J. et al. Digital phenotyping from wearables using AI characterizes psychiatric disorders and identifies genetic associations. Cell 188, 515–529.e15 (2025)

[20]

Xie, E. et al. JETS: A Self-Supervised Joint Embedding Time Series Foundation Model for Behavioral Data in Healthcare. in (2025)

[21]

Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015)

[22]

Eichstaedt, J. C. et al. Facebook language predicts depression in medical records. Proceedings of the National Academy of Sciences 115, 11203–11208 (2018)

[23]

Kelley, S. W. et al. Using language in social media posts to study the network dynamics of depression longitudinally. Nature Communications 13, 870 (2022)

[24]

Mirea, D.-M. et al. Cognitive modeling of real-world behavior for understanding mental health. Trends in Cognitive Sciences (2025) doi: 10.1016/j.tics.2025.07.009

[25]

Freeman, J. B. Doing Psychological Science by Hand. Current Directions in Psychological Science 27, 315–323 (2018)

[26]

Jain, S. H. et al. The digital phenotype. Nature Biotechnology 33, 462–463 (2015)

[27]

Insel, T. R. Digital Phenotyping: Technology for a New Science of Behavior. JAMA 318, 1215–1216 (2017)

[28]

Wainberg, M. L. et al. Challenges and Opportunities in Global Mental Health: A Research-to-Practice Perspective. Current Psychiatry Reports 19, 28 (2017)

[29]

Barrett, P. M. et al. Digitising the mind. The Lancet 389, 1877 (2017)

[30]

Topol, E. J. High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine 25, 44–56 (2019)

[31]

Onnela, J.-P. Opportunities and challenges in the collection and analysis of digital phenotyping data. Neuropsychopharmacology 46, 45–54 (2021)

[32]

Hauser, T. U. et al. The promise of a model-based psychiatry: Building computational models of mental ill health. The Lancet Digital Health 4, e816–e828 (2022)

[33]

Koutsouleris, N. et al. From promise to practice: Towards the realisation of AI-informed mental health care. The Lancet Digital Health 4, e829–e840 (2022)

[34]

Galatzer-Levy, I. R. et al. Machine Learning and the Digital Measurement of Psychological Health. Annual Review of Clinical Psychology 19, 133–154 (2023)

[35]

Picard, R. W. Affective computing. (MIT Press, 1997)

[36]

Darwin, C. et al. The Expression of the Emotions in Man and Animals, Definitive Edition. (Oxford University Press, 1998)

[37]

Ekman, P. Emotional and Conversational Nonverbal Signals. in Language, Knowledge, and Representation (eds. Larrazabal, J. M. et al.) 39–50 (Springer Netherlands, 2004)

[38]

Wolpert, D. M. et al. A unifying computational framework for motor control and social interaction. Philosophical Transactions of the Royal Society B: Biological Sciences 358, 593–602 (2003)

[39]

Shadmehr, R. et al. Error correction, sensory prediction, and adaptation in motor control. Annual Review of Neuroscience 33, 89–108 (2010)

[40]

Schoemann, M. et al. Using mouse cursor tracking to investigate online cognition: Preserving methodological ingenuity while moving toward reproducible science. Psychonomic Bulletin & Review 28, 766–787 (2021)

[41]

Freihaut, P. et al. Tracking stress via the computer mouse? Promises and challenges of a potential behavioral stress marker. Behavior Research Methods 53, 2281–2301 (2021)

[42]

De Angel, V. et al. Digital health tools for the passive monitoring of depression: A systematic review of methods. npj Digital Medicine 5, 3 (2022)

[43]

Insel, T. et al. Research domain criteria (RDoC): Toward a new classification framework for research on mental disorders. The American Journal of Psychiatry 167, 748–751 (2010)

[44]

Kotov, R. et al. A paradigm shift in psychiatric classification: The Hierarchical Taxonomy Of Psychopathology (HiTOP). World Psychiatry 17, 24–25 (2018)

[45]

Kılıç, A. A. et al. Bogazici mouse dynamics dataset. Data in Brief 36, 107094 (2021)

[46]

Westerhof, G. J. et al. Mental Illness and Mental Health: The Two Continua Model Across the Lifespan. Journal of Adult Development 17, 110–119 (2010)

[47]

Saragosa-Harris, N. M. et al. Real-World Exploration Increases Across Adolescence and Relates to Affect, Risk Taking, and Social Connectivity. Psychological Science 33, 1664–1679 (2022)

[48]

Schurr, R. et al. Dynamic computational phenotyping of human cognition. Nature Human Behaviour 8, 917–931 (2024)

[49]

So, S. H. et al. Jumping to conclusions data-gathering bias in psychosis and other psychiatric disorders — Two meta-analyses of comparisons between patients and healthy individuals . Clinical Psychology Review 46, 151–167 (2016)

[50]

Gillan, C. M. et al. Smartphones and the Neuroscience of Mental Health. Annual Review of Neuroscience 44, 129–151 (2021)

[51]

Kuppens, P. et al. Emotional inertia and psychological maladjustment. Psychological Science 21, 984–991 (2010)

[52]

Caspi, A. et al. All for One and One for All: Mental Disorders in One Dimension. American Journal of Psychiatry 175, 831–844 (2018)

[53]

Golder, S. A. et al. Diurnal and Seasonal Mood Vary with Work, Sleep, and Daylength Across Diverse Cultures. Science 333, 1878–1881 (2011)

[54]

Perris, F. et al. Duration of Untreated Illness in Patients with Obsessive-Compulsive Disorder and Its Impact on Long-Term Outcome: A Systematic Review. Journal of Personalized Medicine 13, 1453 (2023)

[55]

Obermeyer, Z. et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019)

[56]

Keyes, K. M. et al. UK Biobank, big data, and the consequences of non-representativeness. Lancet (London, England) 393, 1297 (2019)

[57]

Karvelis, P. et al. Individual differences in computational psychiatry: A review of current challenges. Neuroscience & Biobehavioral Reviews 148, 105137 (2023)

[58]

Arena, A. F. et al. Mental health and unemployment: A systematic review and meta-analysis of interventions to improve depression and anxiety outcomes. Journal of Affective Disorders 335, 450–472 (2023)

[59]

Charles, S. T. et al. Social and Emotional Aging. Annual Review of Psychology 61, 383–409 (2010)

[60]

Kuehner, C. Why is depression more common among women than among men? The Lancet Psychiatry 4, 146–158 (2017)

[61]

Trepka, E. et al. Entropy-based metrics for predicting choice behavior based on local response to reward. Nature Communications 12, 6567 (2021)

[62]

Bennett, D. et al. The Two Cultures of Computational Psychiatry. JAMA Psychiatry 76, 563–564 (2019)

[63]

Torous, J. et al. The growing field of digital psychiatry: Current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry: official journal of the World Psychiatric Association (WPA) 20, 318–335 (2021)

[64]

LeCun, Y. et al. Deep learning. Nature 521, 436–444 (2015)

[65]

Lebowitz, M. S. et al. Testing positive for a genetic predisposition to depression magnifies retrospective memory for depressive symptoms. Journal of Consulting and Clinical Psychology 85, 1052–1063 (2017)

[66]

Lekadir, K. et al. FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare. (2025) doi: 10.1136/bmj-2024-081554

[67]

Derogatis, L. R. et al. The Brief Symptom Inventory: An introductory report. Psychological Medicine 13, 595–605 (1983)

[68]

Keyes, C. L. M. et al. Evaluation of the mental health continuum-short form (MHC-SF) in Setswana-speaking South Africans. Clinical Psychology & Psychotherapy 15, 181–192 (2008)
