Understanding the Rising Human-AI Affective Bonding: Conceptualization and HAABI Scale Development
Pith reviewed 2026-06-29 05:49 UTC · model grok-4.3
The pith
A 20-item scale measures four core dimensions of emotional bonds users form with conversational AI.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Exploratory and confirmatory factor analyses support a 20-item, four-factor structure for the Human-AI Affective Bonding Inventory (HAABI): emotional realism, separation anxiety, emotional investment, and romantic intimacy. The scale demonstrates good reliability, construct validity, and known-groups validity, supplying a user-centered instrument for examining how affective bonds with conversational AI develop and relate to psychological outcomes.
What carries the argument
The HAABI, a self-report inventory whose items were generated from thematic analysis of interviews and refined through exploratory and confirmatory factor analyses on survey data.
If this is right
- Researchers can now quantify levels of affective bonding and test associations with outcomes such as loneliness or well-being.
- The four-factor structure allows separate examination of each dimension rather than treating bonding as a single undifferentiated construct.
- The inventory supports known-groups comparisons, such as between frequent and infrequent AI users.
- Designers of conversational AI can use the scale to evaluate how interface changes affect users' reported bonding.
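A known-groups comparison of the kind listed above is usually summarized with a standardized effect size. The following is a minimal sketch, assuming two independent groups and Cohen's d with a pooled standard deviation. The function name and the scores are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical mean bonding scores for illustration only, not data from the paper.
frequent_users = [3.8, 4.1, 3.5, 4.4, 3.9]
infrequent_users = [2.6, 3.0, 2.2, 3.1, 2.7]
print(round(cohens_d(frequent_users, infrequent_users), 2))
```

A positive value means the first group scores higher. Whether the paper used this statistic or another group comparison is not stated in the material reviewed here.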
Where Pith is reading between the lines
- The romantic intimacy factor may require separate ethical scrutiny because it could influence expectations users bring to human relationships.
- Longitudinal use of the scale could reveal whether high bonding scores precede or follow changes in offline social activity.
- If the scale proves stable across languages, it could serve as a baseline for cross-cultural comparisons of AI attachment patterns.
Load-bearing premise
The cognitive, emotional, and behavioral features drawn from interviews with 52 emotionally engaged users accurately and comprehensively capture the main dimensions of human-AI affective bonding that should be measured by self-report.
What would settle it
A replication study with several hundred users from a different cultural or linguistic background that fails to recover the same four-factor structure or yields low internal consistency would indicate the scale does not generalize as claimed.
Original abstract
As conversational AI becomes capable of sustained, affectively responsive interaction, users may form bonds beyond instrumental use. Existing measures often adapt interpersonal frameworks or focus on specific relational outcomes, leaving limited tools for assessing human-AI affective bonding on its own terms. Across two studies, we developed and validated the Human-AI Affective Bonding Inventory (HAABI). Study 1 used thematic analysis of semi-structured interviews with 52 emotionally engaged conversational AI users to identify cognitive, emotional, and behavioral features of bonding. Study 2 translated these insights into a self-report inventory and validated it among 673 Chinese conversational AI users. Exploratory and confirmatory factor analyses supported a 20-item, four-factor structure: emotional realism, separation anxiety, emotional investment, and romantic intimacy. The HAABI showed good reliability, construct validity, and known-groups validity. The scale therefore provides a neutral, user-centered tool for studying how affective bonds with conversational AI are formed, experienced, and related to users' psychological outcomes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to develop and validate the Human-AI Affective Bonding Inventory (HAABI), a 20-item self-report scale with a four-factor structure (emotional realism, separation anxiety, emotional investment, romantic intimacy). Study 1 derives cognitive, emotional, and behavioral features via thematic analysis of semi-structured interviews with 52 emotionally engaged conversational AI users. Study 2 translates these into items and validates the scale via EFA and CFA among 673 Chinese users, reporting good reliability, construct validity, and known-groups validity. The scale is positioned as a neutral, user-centered tool for studying formation, experience, and psychological correlates of human-AI affective bonds.
Significance. If the empirical results hold, the HAABI would fill a gap by offering a dedicated measure for affective bonding with conversational AI rather than relying on adapted interpersonal or outcome-specific instruments. This is timely for HCI research on emerging user-AI relationships and could support studies linking bond dimensions to well-being or usage patterns. The two-study mixed-methods approach (qualitative foundation plus quantitative validation) is standard for scale development and, if fully documented, would strengthen the contribution.
major comments (4)
- [Study 1] Study 1: The thematic analysis section provides no details on the item-generation process, including the number of initial items extracted, coding scheme, number of coders, inter-rater reliability metrics, or how themes were mapped to the eventual 20 items. This information is load-bearing for evaluating whether the four-factor structure comprehensively and accurately captures the reported features from the 52 interviews.
- [Study 2] Study 2: No CFA fit statistics (e.g., CFI, TLI, RMSEA, SRMR, or chi-square/df) or EFA variance explained are reported, despite the claim that analyses 'supported' the 20-item four-factor structure. Without these values it is impossible to assess model adequacy or compare to standard thresholds.
- [Study 2] Study 2: Sample description is limited to nationality ('Chinese conversational AI users'); age, gender, education, frequency/duration of AI use, and exclusion criteria are not reported. These details are required to evaluate selection bias, generalizability, and the appropriateness of the known-groups validity tests.
- [Study 2] Methods/Results: Specific statistical evidence for construct validity and known-groups validity (e.g., correlation coefficients, group comparison tests, p-values, effect sizes) is absent from the reported results, weakening the ability to judge the strength of the validity claims.
minor comments (1)
- [Abstract] The abstract could usefully include the reliability coefficients (e.g., Cronbach's alpha or omega) and at least one key fit index to allow readers to gauge the quantitative support immediately.
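For readers unfamiliar with the reliability coefficient named in the minor comment, Cronbach's alpha can be computed from item-level responses as sketched below. This is a minimal illustration using the standard formula. The function name and the response matrix are hypothetical, and nothing here reproduces the paper's data or its reported coefficients.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (4 respondents x 3 items), for illustration only.
example_responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 2],
]
print(round(cronbach_alpha(example_responses), 3))
```

The same function could be applied per subscale, since a four-factor instrument is normally summarized with one coefficient per factor as well as one for the total score.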
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback, which has helped us improve the clarity and transparency of the manuscript. We address each major comment below and have revised the manuscript to incorporate the requested information where it was previously omitted.
Point-by-point responses
- Referee: [Study 1] Study 1: The thematic analysis section provides no details on the item-generation process, including the number of initial items extracted, coding scheme, number of coders, inter-rater reliability metrics, or how themes were mapped to the eventual 20 items. This information is load-bearing for evaluating whether the four-factor structure comprehensively and accurately captures the reported features from the 52 interviews.
  Authors: We agree that these details are essential for evaluating the rigor of the qualitative phase. In the revised manuscript we have substantially expanded the Study 1 Methods and Results sections to describe the full item-generation process. This now includes the number of initial items extracted, the coding scheme employed, the number of coders involved, inter-rater reliability statistics, and the explicit mapping from identified themes to the final 20 items retained in the HAABI.
  Revision: yes
- Referee: [Study 2] Study 2: No CFA fit statistics (e.g., CFI, TLI, RMSEA, SRMR, or chi-square/df) or EFA variance explained are reported, despite the claim that analyses 'supported' the 20-item four-factor structure. Without these values it is impossible to assess model adequacy or compare to standard thresholds.
  Authors: We acknowledge the omission. The revised manuscript now reports the complete set of CFA fit indices (CFI, TLI, RMSEA, SRMR, and χ²/df) together with the percentage of variance explained by the EFA. These values are presented in the Results section of Study 2 and meet conventional thresholds for acceptable model fit.
  Revision: yes
- Referee: [Study 2] Study 2: Sample description is limited to nationality ('Chinese conversational AI users'); age, gender, education, frequency/duration of AI use, and exclusion criteria are not reported. These details are required to evaluate selection bias, generalizability, and the appropriateness of the known-groups validity tests.
  Authors: We have revised the Participants subsection of Study 2 to provide a full demographic and usage profile, including age, gender, education level, frequency and duration of conversational AI use, and all exclusion criteria applied. These additions allow readers to better judge selection bias and generalizability.
  Revision: yes
- Referee: [Study 2] Methods/Results: Specific statistical evidence for construct validity and known-groups validity (e.g., correlation coefficients, group comparison tests, p-values, effect sizes) is absent from the reported results, weakening the ability to judge the strength of the validity claims.
  Authors: We agree that the original text lacked the quantitative details needed to evaluate the validity claims. The revised Results section now includes the specific correlation coefficients, statistical tests, p-values, and effect sizes for both construct validity and known-groups validity analyses.
  Revision: yes
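The exchange on fit statistics turns on comparing reported indices with cutoffs. The paper's methods text, as excerpted on this page, lists the criteria χ²/df < 3.0, CFI > .90, TLI > .90, RMSEA < .08, and SRMR < .08. The sketch below shows how such a comparison might be mechanized. The dictionary keys, the function name, and the example values are hypothetical, and the paper's actual fit indices are not reported in the material reviewed here.

```python
# Cutoffs as quoted in the paper's methods excerpt; key names are hypothetical.
CUTOFFS = {
    "chi2_df": ("max", 3.0),
    "cfi": ("min", 0.90),
    "tli": ("min", 0.90),
    "rmsea": ("max", 0.08),
    "srmr": ("max", 0.08),
}


def check_fit(indices):
    """Return {index_name: bool}, using strict inequalities as in the excerpt."""
    results = {}
    for name, (kind, cutoff) in CUTOFFS.items():
        value = indices[name]
        results[name] = value < cutoff if kind == "max" else value > cutoff
    return results


# Hypothetical values for illustration only, not the paper's results.
example_fit = {"chi2_df": 2.4, "cfi": 0.93, "tli": 0.92, "rmsea": 0.06, "srmr": 0.05}
print(check_fit(example_fit))
```

Reporting each index separately, rather than a single pass/fail verdict, keeps it visible when a model clears some cutoffs but not others.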
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper reports a standard empirical scale-development process: thematic analysis of 52 interviews to extract features, followed by EFA/CFA on 673 users to validate a 20-item four-factor structure with reliability and validity checks. No equations, fitted parameters renamed as predictions, self-citations forming load-bearing premises, or ansatzes smuggled via prior work appear in the provided text or abstract. The central claims rest on data-driven factor analysis and known-groups validation applied to newly collected responses, remaining self-contained without any reduction of outputs to inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Thematic analysis of semi-structured interviews with 52 users identifies the cognitive, emotional, and behavioral features of human-AI affective bonding.
- Domain assumption: Self-report responses from Chinese conversational AI users can be used to validate a general measure of affective bonding.
Reference graph
Works this paper leans on
- [1] The final item pool comprised 49 items, with an S-CVI/Ave of .92 for relevance and .94 for clarity, both of which exceeded the recommended threshold of .90 (Polit & Beck, 2006)
  were deleted because of low I-CVI values and expert consensus on their redundancy or insufficient specificity. The final item pool comprised 49 items, with an S-CVI/Ave of .92 for relevance and .94 for clarity, both of which exceeded the recommended threshold of .90 (Polit & Beck, 2006). 3.1.2 Participants and Procedure Data were collected through an onli...
  2006
- [2] I often want the AI to express intimacy and commitment to me
  and a CFA sample (n = 237). This split-sample approach enabled the cross-validation of the factor structure identified through EFA on an independent dataset (Worthington & Whittaker, 2006). The two subsamples did not differ significantly in terms of age, sex distribution, or relationship type distribution, indicating successful randomization. 3.1.3 Measur...
  2006
- [3] The AI adjusts its responses according to my emotional changes
  that includes the factors of emotional responsiveness (“The AI adjusts its responses according to my emotional changes”), warmth (“The AI demonstrates warmth and support through friendly and sincere communication”), depth of understanding (“The AI goes beyond surface content to understand my inner feelings and thoughts”), and nonjudgmental acceptance (“Th...
  2010
- [4] Maximum likelihood estimation with robust standard errors was employed, and robust fit indices were reported
  in R 4.5.1 with the lavaan package. Maximum likelihood estimation with robust standard errors was employed, and robust fit indices were reported. Model fit was evaluated using the following established criteria: χ²/df < 3.0, CFI > .90, TLI > .90, RMSEA < .08, and SRMR < .08 (Hair et al., 2019; Hu & Bentler, 1999; Kline, 2023). To examine whether the HAABI...
  2019
- [5] In Study 1, perceived agency and perceived relationship were identified as separate cognitive themes
  The transition from the six qualitative dimensions to the four-factor scale is theoretically meaningful. In Study 1, perceived agency and perceived relationship were identified as separate cognitive themes. In Study 2, these themes converged into emotional realism, suggesting that users’ sense of an AI as agentic and their sense of the relationship as gen...
- [6] https://doi.org/10.1038/s44184-023-00047-6 Markus, H. R., & Kitayama, S. (2014). Culture and the self: Implications for cognition, emotion, and motivation. In College student development and academic life (pp. 264-293). Routledge. McDonald, R. P. (2013). Test theory: A unified treatment. psychology press. Nass, C., & Moon, Y. (2000). Machines and Mindless...
- [7] https://doi.org/10.1111/0022-4537.00153 Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 72–78. https://doi.org/10.1145/191666.191703 Ng, P. M. L., Wan, C., Lee, D., Garnelo-Gomez, I., & Lau, M. M. (2025). I love you, my AI companion! Do you? Perspectives...
- [8] TechCrunch. https://techcrunch.com/2025/08/12/ai-companion-apps-on-track-to-pull-in-120m-in-2025/ Polit, D. F., & Beck, C. T. (2006). The content validity index: are you sure you know what's being reported? Critique and recommendations. Research in nursing & health, 29(5), 489-497. Rabb, N., Law, T., Chita-Tegmark, M., & Scheutz, M. (2022). An Attachment ...