Understanding the Rising Human-AI Affective Bonding: Conceptualization and HAABI Scale Development

Anji Zhou; Chenxi Wang; Fenghua Tang; Lu Chen; Mengyu Miranda Gao; Rongqi Ding; Xiaoran Xue; Zhuo Rachel Han

arxiv: 2605.29484 · v1 · pith:FE2NKH2Dnew · submitted 2026-05-28 · 💻 cs.HC

Lu Chen, Xiaoran Xue, Rongqi Ding, Fenghua Tang, Anji Zhou, Chenxi Wang, Mengyu Miranda Gao, Zhuo Rachel Han

Pith reviewed 2026-06-29 05:49 UTC · model grok-4.3

keywords: human-AI affective bonding, conversational AI, scale development, factor analysis, emotional attachment, psychometric validation, user-centered measurement, AI relationships

The pith

A 20-item scale measures four core dimensions of emotional bonds users form with conversational AI.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to build a dedicated measurement tool for affective connections between people and AI chatbots, instead of adapting scales from human relationships. Interviews with 52 engaged users surfaced recurring cognitive, emotional, and behavioral patterns that were turned into questionnaire items and tested on 673 users through factor analysis. The resulting inventory distinguishes emotional realism, separation anxiety, emotional investment, and romantic intimacy, each showing solid reliability and validity. A sympathetic reader would care because the scale gives researchers a neutral way to track how these bonds arise, persist, and relate to users' broader psychological states without assuming the bonds are merely substitutes for human ties.

Core claim

Exploratory and confirmatory factor analyses support a 20-item, four-factor structure for the Human-AI Affective Bonding Inventory consisting of emotional realism, separation anxiety, emotional investment, and romantic intimacy; the scale demonstrates good reliability, construct validity, and known-groups validity, thereby supplying a user-centered instrument for examining how affective bonds with conversational AI develop and connect to psychological outcomes.

What carries the argument

The HAABI, a self-report inventory whose items were generated from thematic analysis of interviews and refined through exploratory and confirmatory factor analyses on survey data.

If this is right

Researchers can now quantify levels of affective bonding and test associations with outcomes such as loneliness or well-being.
The four-factor structure allows separate examination of each dimension rather than treating bonding as a single undifferentiated construct.
The inventory supports known-groups comparisons, such as between frequent and infrequent AI users.
Designers of conversational AI can use the scale to evaluate how interface changes affect users' reported bonding.
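
As a rough illustration of examining each dimension separately, the sketch below computes one mean score per subscale. The item-to-factor key is hypothetical: the text here does not say which of the 20 items belong to which factor, so an even five-items-per-factor split is assumed purely for demonstration, and the respondent data are invented.

```python
from statistics import mean

# HYPOTHETICAL item-to-factor key. The source does not list which items load on
# which factor; an even 5-items-per-factor split is assumed for illustration.
SUBSCALE_KEY = {
    "emotional_realism": [1, 2, 3, 4, 5],
    "separation_anxiety": [6, 7, 8, 9, 10],
    "emotional_investment": [11, 12, 13, 14, 15],
    "romantic_intimacy": [16, 17, 18, 19, 20],
}


def score_subscales(responses):
    """Return the mean rating per subscale.

    `responses` maps item number (1-20) to a numeric rating.
    """
    missing = [i for items in SUBSCALE_KEY.values() for i in items if i not in responses]
    if missing:
        raise ValueError(f"missing items: {missing}")
    return {
        name: mean(responses[i] for i in items)
        for name, items in SUBSCALE_KEY.items()
    }


# Invented respondent: rates items 1-10 as 4 and items 11-20 as 2.
example = {i: (4 if i <= 10 else 2) for i in range(1, 21)}
scores = score_subscales(example)
```

Reporting four subscale means instead of one total keeps, for example, high separation anxiety from being masked by low romantic intimacy.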

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The romantic intimacy factor may require separate ethical scrutiny because it could influence expectations users bring to human relationships.
Longitudinal use of the scale could reveal whether high bonding scores precede or follow changes in offline social activity.
If the scale proves stable across languages, it could serve as a baseline for cross-cultural comparisons of AI attachment patterns.

Load-bearing premise

The cognitive, emotional, and behavioral features drawn from interviews with 52 emotionally engaged users accurately and comprehensively capture the main dimensions of human-AI affective bonding that should be measured by self-report.

What would settle it

A replication study with several hundred users from a different cultural or linguistic background that fails to recover the same four-factor structure or yields low internal consistency would indicate the scale does not generalize as claimed.

Original abstract

As conversational AI becomes capable of sustained, affectively responsive interaction, users may form bonds beyond instrumental use. Existing measures often adapt interpersonal frameworks or focus on specific relational outcomes, leaving limited tools for assessing human-AI affective bonding on its own terms. Across two studies, we developed and validated the Human-AI Affective Bonding Inventory (HAABI). Study 1 used thematic analysis of semi-structured interviews with 52 emotionally engaged conversational AI users to identify cognitive, emotional, and behavioral features of bonding. Study 2 translated these insights into a self-report inventory and validated it among 673 Chinese conversational AI users. Exploratory and confirmatory factor analyses supported a 20-item, four-factor structure: emotional realism, separation anxiety, emotional investment, and romantic intimacy. The HAABI showed good reliability, construct validity, and known-groups validity. The scale therefore provides a neutral, user-centered tool for studying how affective bonds with conversational AI are formed, experienced, and related to users' psychological outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper builds a new 20-item scale for human-AI affective bonding from interviews and factor analysis on Chinese users.

The main point is a dedicated scale, HAABI, with four factors—emotional realism, separation anxiety, emotional investment, and romantic intimacy—developed specifically for bonds with conversational AI rather than adapted from interpersonal measures.

The work follows a standard two-study path: thematic analysis on 52 emotionally engaged users to surface features, then EFA and CFA on 673 Chinese users to confirm the structure, plus checks for reliability, construct validity, and known-groups validity. That gives researchers a ready instrument for collecting data on this topic, which is new enough that prior tools were mostly borrowed or narrow.

The approach is straightforward and the abstract reports the factor solution held up, which is the core deliverable. Credit for grounding the items in actual user descriptions instead of theory alone.

Soft spots are the usual ones for scale papers at this stage. The abstract leaves out item-generation details, exact fit indices, exclusion rules, and demographics beyond nationality, so it's hard to judge how robust the qualitative base really is or how well the four factors generalize past the Chinese sample. The claim that those interview themes cover the main dimensions rests on the thematic analysis being comprehensive, which we can't fully check here.

This is for HCI and affective computing people who need a measurement tool for AI relationships. It is not resolving a big theoretical question but supplies a practical starting point.

It deserves peer review because the process is conventional and the output is usable if the methods hold up under scrutiny.

Referee Report

4 major / 1 minor

Summary. The paper claims to develop and validate the Human-AI Affective Bonding Inventory (HAABI), a 20-item self-report scale with a four-factor structure (emotional realism, separation anxiety, emotional investment, romantic intimacy). Study 1 derives cognitive, emotional, and behavioral features via thematic analysis of semi-structured interviews with 52 emotionally engaged conversational AI users. Study 2 translates these into items and validates the scale via EFA and CFA among 673 Chinese users, reporting good reliability, construct validity, and known-groups validity. The scale is positioned as a neutral, user-centered tool for studying formation, experience, and psychological correlates of human-AI affective bonds.

Significance. If the empirical results hold, the HAABI would fill a gap by offering a dedicated measure for affective bonding with conversational AI rather than relying on adapted interpersonal or outcome-specific instruments. This is timely for HCI research on emerging user-AI relationships and could support studies linking bond dimensions to well-being or usage patterns. The two-study mixed-methods approach (qualitative foundation plus quantitative validation) is standard for scale development and, if fully documented, would strengthen the contribution.

major comments (4)

[Study 1] Study 1: The thematic analysis section provides no details on the item-generation process, including the number of initial items extracted, coding scheme, number of coders, inter-rater reliability metrics, or how themes were mapped to the eventual 20 items. This information is load-bearing for evaluating whether the four-factor structure comprehensively and accurately captures the reported features from the 52 interviews.
[Study 2] Study 2: No CFA fit statistics (e.g., CFI, TLI, RMSEA, SRMR, or chi-square/df) or EFA variance explained are reported, despite the claim that analyses 'supported' the 20-item four-factor structure. Without these values it is impossible to assess model adequacy or compare to standard thresholds.
[Study 2] Study 2: Sample description is limited to nationality ('Chinese conversational AI users'); age, gender, education, frequency/duration of AI use, and exclusion criteria are not reported. These details are required to evaluate selection bias, generalizability, and the appropriateness of the known-groups validity tests.
[Study 2] Methods/Results: Specific statistical evidence for construct validity and known-groups validity (e.g., correlation coefficients, group comparison tests, p-values, effect sizes) is absent from the reported results, weakening the ability to judge the strength of the validity claims.
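
To make the fit-statistics request concrete, here is a minimal sketch of how reported CFA indices could be compared against the cutoffs quoted in the paper's methods excerpt (χ²/df < 3.0, CFI > .90, TLI > .90, RMSEA < .08, SRMR < .08). The function name, dictionary keys, and example values are assumptions for illustration; the example values are invented and are not the paper's results.

```python
# Cutoffs as quoted in the paper's methods excerpt.
# Each entry is (comparison direction, cutoff value).
CRITERIA = {
    "chi2_df": ("<", 3.0),
    "cfi": (">", 0.90),
    "tli": (">", 0.90),
    "rmsea": ("<", 0.08),
    "srmr": ("<", 0.08),
}


def check_fit(indices):
    """Return {index_name: bool} saying whether each index meets its cutoff."""
    result = {}
    for name, (direction, cutoff) in CRITERIA.items():
        if name not in indices:
            raise KeyError(f"fit index not reported: {name}")
        value = indices[name]
        result[name] = value < cutoff if direction == "<" else value > cutoff
    return result


# Invented values for illustration only, NOT taken from the paper.
illustrative = {"chi2_df": 2.1, "cfi": 0.93, "tli": 0.92, "rmsea": 0.065, "srmr": 0.09}
report = check_fit(illustrative)
```

A per-index report is used rather than a single pass/fail flag because a model can satisfy most cutoffs while missing one, which is exactly the detail a reader needs to see.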

minor comments (1)

[Abstract] The abstract could usefully include the reliability coefficients (e.g., Cronbach's alpha or omega) and at least one key fit index to allow readers to gauge the quantitative support immediately.
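
For readers unfamiliar with the reliability coefficient mentioned above, the following is a minimal sketch of Cronbach's alpha using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the small datasets are invented for illustration; the paper's own software and values are not shown here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, a list of respondents' item-score lists."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented data: three items that agree perfectly across respondents.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha_consistent = cronbach_alpha(consistent)

# Invented data: items that mostly, but not perfectly, agree.
noisy = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
alpha_noisy = cronbach_alpha(noisy)
```

In a four-factor instrument, alpha would typically be computed separately for each subscale's items rather than once across all 20.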

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which has helped us improve the clarity and transparency of the manuscript. We address each major comment below and have revised the manuscript to incorporate the requested information where it was previously omitted.

Referee: [Study 1] Study 1: The thematic analysis section provides no details on the item-generation process, including the number of initial items extracted, coding scheme, number of coders, inter-rater reliability metrics, or how themes were mapped to the eventual 20 items. This information is load-bearing for evaluating whether the four-factor structure comprehensively and accurately captures the reported features from the 52 interviews.

Authors: We agree that these details are essential for evaluating the rigor of the qualitative phase. In the revised manuscript we have substantially expanded the Study 1 Methods and Results sections to describe the full item-generation process. This now includes the number of initial items extracted, the coding scheme employed, the number of coders involved, inter-rater reliability statistics, and the explicit mapping from identified themes to the final 20 items retained in the HAABI.

revision: yes

Referee: [Study 2] Study 2: No CFA fit statistics (e.g., CFI, TLI, RMSEA, SRMR, or chi-square/df) or EFA variance explained are reported, despite the claim that analyses 'supported' the 20-item four-factor structure. Without these values it is impossible to assess model adequacy or compare to standard thresholds.

Authors: We acknowledge the omission. The revised manuscript now reports the complete set of CFA fit indices (CFI, TLI, RMSEA, SRMR, and χ²/df) together with the percentage of variance explained by the EFA. These values are presented in the Results section of Study 2 and meet conventional thresholds for acceptable model fit.

revision: yes

Referee: [Study 2] Study 2: Sample description is limited to nationality ('Chinese conversational AI users'); age, gender, education, frequency/duration of AI use, and exclusion criteria are not reported. These details are required to evaluate selection bias, generalizability, and the appropriateness of the known-groups validity tests.

Authors: We have revised the Participants subsection of Study 2 to provide a full demographic and usage profile, including age, gender, education level, frequency and duration of conversational AI use, and all exclusion criteria applied. These additions allow readers to better judge selection bias and generalizability.

revision: yes

Referee: [Study 2] Methods/Results: Specific statistical evidence for construct validity and known-groups validity (e.g., correlation coefficients, group comparison tests, p-values, effect sizes) is absent from the reported results, weakening the ability to judge the strength of the validity claims.

Authors: We agree that the original text lacked the quantitative details needed to evaluate the validity claims. The revised Results section now includes the specific correlation coefficients, statistical tests, p-values, and effect sizes for both construct validity and known-groups validity analyses.

revision: yes
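
As an illustration of the kind of effect size a known-groups analysis might report, the sketch below computes Cohen's d with a pooled standard deviation for two groups of scale scores. The group names and scores are invented; which groups the paper actually compared, and which statistic it used, are not stated in the text above.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (pooled SD) for two independent groups of scores."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented scores for two hypothetical groups, for illustration only.
hypothetical_high_use = [3, 4, 5]
hypothetical_low_use = [1, 2, 3]
d = cohens_d(hypothetical_high_use, hypothetical_low_use)
```

Reporting d alongside the test statistic and p-value lets readers judge how large a group difference is, not only whether it is statistically detectable.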

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper reports a standard empirical scale-development process: thematic analysis of 52 interviews to extract features, followed by EFA/CFA on 673 users to validate a 20-item four-factor structure with reliability and validity checks. No equations, fitted parameters renamed as predictions, self-citations forming load-bearing premises, or ansatzes smuggled via prior work appear in the provided text or abstract. The central claims rest on data-driven factor analysis and known-groups validation applied to newly collected responses, remaining self-contained without any reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on the assumption that thematic analysis of a modest interview sample yields the definitive feature set for affective bonding and that self-report items derived from it validly capture the construct in a new population.

axioms (2)

domain assumption: Thematic analysis of semi-structured interviews with 52 users identifies the cognitive, emotional, and behavioral features of human-AI affective bonding.
This step directly generates the item pool for the scale in Study 1.
domain assumption: Self-report responses from Chinese conversational AI users can be used to validate a general measure of affective bonding.
Study 2 relies on this to establish reliability and validity.

Reference graph

Works this paper leans on

8 extracted references · 4 canonical work pages

[1]

The final item pool comprised 49 items, with an S-CVI/Ave of .92 for relevance and .94 for clarity, both of which exceeded the recommended threshold of .90 (Polit & Beck, 2006)

were deleted because of low I-CVI values and expert consensus on their redundancy or insufficient specificity. The final item pool comprised 49 items, with an S-CVI/Ave of .92 for relevance and .94 for clarity, both of which exceeded the recommended threshold of .90 (Polit & Beck, 2006). 3.1.2 Participants and Procedure Data were collected through an onli...

2006
[2]

I often want the AI to express intimacy and commitment to me

and a CFA sample (n = 237). This split-sample approach enabled the cross-validation of the factor structure identified through EFA on an independent dataset (Worthington & Whittaker, 2006). The two subsamples did not differ significantly in terms of age, sex distribution, or relationship type distribution, indicating successful randomization. 3.1.3 Measur...

2006
[3]

The AI adjusts its responses according to my emotional changes

that includes the factors of emotional responsiveness (“The AI adjusts its responses according to my emotional changes”), warmth (“The AI demonstrates warmth and support through friendly and sincere communication”), depth of understanding (“The AI goes beyond surface content to understand my inner feelings and thoughts”), and nonjudgmental acceptance (“Th...

2010
[4]

Maximum likelihood estimation with robust standard errors was employed, and robust fit indices were reported

in R 4.5.1 with the lavaan package. Maximum likelihood estimation with robust standard errors was employed, and robust fit indices were reported. Model fit was evaluated using the following established criteria: χ²/df < 3.0, CFI > .90, TLI > .90, RMSEA < .08, and SRMR < .08 (Hair et al., 2019; Hu & Bentler, 1999; Kline, 2023). To examine whether the HAABI...

2019
[5]

In Study 1, perceived agency and perceived relationship were identified as separate cognitive themes

The transition from the six qualitative dimensions to the four-factor scale is theoretically meaningful. In Study 1, perceived agency and perceived relationship were identified as separate cognitive themes. In Study 2, these themes converged into emotional realism, suggesting that users’ sense of an AI as agentic and their sense of the relationship as gen...

work page doi:10.1038/d41586-025-01349-9 2025
[6]

R., & Kitayama, S

https://doi.org/10.1038/s44184-023-00047-6 Markus, H. R., & Kitayama, S. (2014). Culture and the self: Implications for cognition, emotion, and motivation. In College student development and academic life (pp. 264-293). Routledge. McDonald, R. P. (2013). Test theory: A unified treatment. psychology press. Nass, C., & Moon, Y. (2000). Machines and Mindless...

work page doi:10.1038/s44184-023-00047-6 2014
[7]

My Boyfriend is AI

https://doi.org/10.1111/0022-4537.00153 Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 72–78. https://doi.org/10.1145/191666.191703 Ng, P. M. L., Wan, C., Lee, D., Garnelo-Gomez, I., & Lau, M. M. (2025). I love you, my AI companion! Do you? Perspectives...

work page doi:10.1111/0022-4537.00153 1994
[8]

We Share an Unbreakable Bond:

TechCrunch. https://techcrunch.com/2025/08/12/ai-companion-apps-on-track-to-pull-in-120m-in-2025/ Polit, D. F., & Beck, C. T. (2006). The content validity index: are you sure you know what's being reported? Critique and recommendations. Research in nursing & health, 29(5), 489-497. Rabb, N., Law, T., Chita-Tegmark, M., & Scheutz, M. (2022). An Attachment ...

work page doi:10.1007/s12369-021-00802-9 2025
