Discourse Behavior of Older Adults Interacting With a Dialogue Agent Competent in Multiple Topics

Kimberly A. Van Orden; Lenhart K. Schubert; Mohammad Rafayet Ali; S. Zahra Razavi

arxiv: 1907.06279 · v1 · pith:SPFZRKD5new · submitted 2019-07-14 · 💻 cs.HC · cs.AI · cs.CY

S. Zahra Razavi, Lenhart K. Schubert, Kimberly A. Van Orden, Mohammad Rafayet Ali

Pith reviewed 2026-05-24 21:19 UTC · model grok-4.3

classification 💻 cs.HC cs.AI cs.CY

keywords older adults · dialogue agent · multi-topic dialogue · verbosity · self-disclosure · sentiment · human-computer interaction

The pith

Older adults interacting with a multi-topic avatar exhibit greater verbosity and self-disclosure on intimate topics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes dialogues between older adults and an avatar capable of discussing 27 topics across three intimacy levels over multiple sessions. It identifies patterns including more talk on difficult topics, growing verbosity across sessions, stronger emotions on life goals, and more personal sharing on close subjects. These findings matter because they demonstrate how users respond to a system designed for varied everyday conversations. If accurate, they support the idea that such avatars can sustain engaging interactions with this demographic.

Core claim

Analysis of the dialogues reveals correlations such as greater verbosity for more difficult topics, increasing verbosity with successive sessions, especially for more difficult topics, stronger sentiment on topics concerned with life goals rather than routine activities, and stronger self-disclosure for more intimate topics. These results reflect positively on the sophistication of the dialogue system.

What carries the argument

The dialogue avatar competent in 27 topics divided into three groups by degrees of intimacy, used to track changes in user verbosity, sentiment, and self-disclosure over repeated sessions.

Load-bearing premise

The three topic groups accurately represent degrees of intimacy that influence user responses, and the avatar's capabilities primarily cause the observed dialogue patterns rather than other influences like user selection or repeated exposure.

What would settle it

A study finding no variation in verbosity or self-disclosure levels when the same users discuss topics from different intimacy groups would challenge the correlations.
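
One way to picture the within-user comparison described above is a paired test on each participant's mean verbosity in two intimacy groups. The following is a minimal sketch, not the paper's analysis: the function name `paired_sign_flip_test`, the choice of a sign-flip permutation test, and all numbers are assumptions made for illustration.

```python
import random
from statistics import mean

def paired_sign_flip_test(low, high, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-participant means.

    low[i] and high[i] are the same participant's mean verbosity on
    low-intimacy and high-intimacy topics (hypothetical inputs).
    Returns (observed mean difference, approximate p-value).
    """
    if len(low) != len(high):
        raise ValueError("each participant needs one value per condition")
    diffs = [h - l for l, h in zip(low, high)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null of no group effect, each paired difference is
        # equally likely to carry either sign.
        flipped = mean(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Made-up numbers for illustration only; not data from the paper.
low_intimacy = [12.0, 15.5, 9.0, 20.0, 14.0, 11.5]
high_intimacy = [18.0, 19.5, 13.0, 26.0, 15.0, 17.5]
diff, p = paired_sign_flip_test(low_intimacy, high_intimacy)
print(f"mean difference = {diff:.2f} words per turn, p ~ {p:.3f}")
```

A large p-value on real data would be the "no variation" outcome that challenges the reported correlations; a small one would not by itself rule out session-order effects.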

Original abstract

We present some results concerning the dialogue behavior and inferred sentiment of a group of older adults interacting with a computer-based avatar. Our avatar is unique in its ability to hold natural dialogues on a wide range of everyday topics: 27 topics in three groups, developed with the help of gerontologists. The three groups vary in “degrees of intimacy”, and as such in degrees of difficulty for the user. Each participant interacted with the avatar for 7-9 sessions over a period of 3-4 weeks; analysis of the dialogues reveals correlations such as greater verbosity for more difficult topics, increasing verbosity with successive sessions, especially for more difficult topics, stronger sentiment on topics concerned with life goals rather than routine activities, and stronger self-disclosure for more intimate topics. In addition to their intrinsic interest, these results also reflect positively on the sophistication of our dialogue system.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This reports some dialogue patterns from older adults using a 27-topic avatar over multiple sessions, but the links to topic intimacy and system quality rest on uncontrolled observations.

The paper collects multi-session dialogue data from older adults interacting with an avatar competent across 27 topics split into three intimacy groups. Participants did 7-9 sessions over 3-4 weeks, and the authors note patterns such as higher verbosity on harder topics, rising verbosity across sessions especially on difficult ones, stronger sentiment on life-goal topics, and more self-disclosure on intimate ones. They suggest these findings also speak well of their dialogue system.

What is new is the concrete empirical record from this population and topic set; the gerontologist input on topic design is a reasonable practical step. The work is straightforward observational HCI and stays within its scope without overreaching into theory.

The soft spots are the missing controls. Nothing in the abstract or the described analysis rules out session-order effects or individual differences as drivers of the verbosity and disclosure increases, so the claim that the patterns reflect topic intimacy or system sophistication stays under-supported. Sample size, statistical tests, and exclusion criteria are also absent, which makes it impossible to gauge how reliable the correlations are.

This is for researchers already working on dialogue agents for aging populations who want examples of topic grouping and simple dialogue measures. A reader in that niche could extract some design ideas, but the paper does not supply enough to change methods or support strong claims. I would send it to peer review if the full methods section supplies the numbers and any attempt at mixed-effects modeling or counterbalancing; otherwise the evidence base is too thin for serious attention.

Referee Report

2 major / 1 minor

Summary. The paper reports an observational study of older adults interacting over 7-9 sessions with a dialogue avatar competent in 27 topics grouped into three intimacy/difficulty levels. It claims to find correlations including greater verbosity on more difficult topics and increasing verbosity across sessions (especially on hard topics), stronger sentiment on life-goal versus routine topics, and greater self-disclosure on intimate topics; these patterns are interpreted as intrinsically interesting and as positive evidence of the dialogue system's sophistication.

Significance. If the correlations prove statistically robust after appropriate controls, the work could inform the design of multi-topic dialogue systems for older users by linking topic intimacy to observable dialogue behaviors. The multi-session, multi-topic setup is a strength, but the current presentation supplies no quantitative support for the claims.

major comments (2)

[Abstract] Abstract: the central claims rest on the existence of specific correlations (verbosity by topic difficulty, verbosity increase over sessions, sentiment by topic type, self-disclosure by intimacy), yet the abstract supplies neither sample size, statistical tests, confidence intervals, nor any description of how the three topic groups were validated as intimacy gradients independent of other factors.
[Abstract] Abstract / Results: the interpretation that the observed patterns 'reflect positively on the sophistication of our dialogue system' requires that topic-group assignment and session-order effects have been disentangled; the manuscript provides no evidence of mixed-effects modeling, counterbalancing, or controls for individual differences or familiarity with the avatar, leaving the attribution vulnerable to confounds.
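
To illustrate what disentangling session order from topic group could involve, here is a minimal sketch that removes each participant's own mean and then fits two slopes by least squares. It is a fixed-effects stand-in for the mixed-effects modeling the comment asks about, not a substitute for it, and nothing here comes from the manuscript: the field names `pid`, `session`, `group`, `verbosity` and the synthetic data are assumptions.

```python
from collections import defaultdict

def center_within(rows, key):
    """Subtract each participant's own mean of rows[i][key]."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r["pid"]] += r[key]
        counts[r["pid"]] += 1
    return [r[key] - sums[r["pid"]] / counts[r["pid"]] for r in rows]

def within_participant_slopes(rows):
    """Least-squares slopes of verbosity on session and topic group after
    removing participant means (controls for stable individual differences)."""
    s = center_within(rows, "session")
    g = center_within(rows, "group")
    y = center_within(rows, "verbosity")
    ss = sum(a * a for a in s)
    gg = sum(b * b for b in g)
    sg = sum(a * b for a, b in zip(s, g))
    sy = sum(a * c for a, c in zip(s, y))
    gy = sum(b * c for b, c in zip(g, y))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = ss * gg - sg * sg
    if abs(det) < 1e-9:
        raise ValueError("session and topic group are collinear within participants")
    return (gg * sy - sg * gy) / det, (ss * gy - sg * sy) / det

# Synthetic data with known slopes (1.0 per session, 2.0 per group level).
rows = []
for pid, base, groups in [("p1", 10.0, [1, 2, 1, 3]), ("p2", 25.0, [2, 1, 3, 1])]:
    for session, group in enumerate(groups, start=1):
        rows.append({"pid": pid, "session": session, "group": group,
                     "verbosity": base + 1.0 * session + 2.0 * group})
b_session, b_group = within_participant_slopes(rows)
print(f"session slope = {b_session:.2f}, topic-group slope = {b_group:.2f}")
```

If every participant met the topic groups in the same session order, the two predictors could be collinear and the function would raise; in that situation no model can separate the two effects, which is the confound the comment points to.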

minor comments (1)

[Abstract] The abstract refers to 'inferred sentiment' and 'self-disclosure' without defining the operational measures or annotation procedures used to extract these features from the dialogues.
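
For concreteness, an operational definition of one such measure might look like the sketch below, which treats verbosity as the mean number of whitespace-separated words per user turn and tabulates it by topic group and session. This is an assumed definition for illustration; the paper's actual measures are not given in the abstract, and the record keys `topic_group`, `session`, and `user_turns` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def words_per_turn(turns):
    """Mean whitespace-token count over a list of user turns (strings)."""
    if not turns:
        return 0.0
    return mean(len(t.split()) for t in turns)

def verbosity_table(records):
    """Average words per turn for each (topic_group, session) cell."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["topic_group"], r["session"])].extend(r["user_turns"])
    return {key: words_per_turn(turns) for key, turns in sorted(cells.items())}

# Invented transcripts for illustration only; not data from the study.
records = [
    {"topic_group": 1, "session": 1, "user_turns": ["I like tea", "Yes"]},
    {"topic_group": 3, "session": 1, "user_turns": ["I worry about my health a lot"]},
    {"topic_group": 3, "session": 2,
     "user_turns": ["My sister and I talk every week now", "It helps"]},
]
table = verbosity_table(records)
for (group, session), value in table.items():
    print(f"group {group}, session {session}: {value:.1f} words per turn")
```

Sentiment and self-disclosure would need their own stated procedures (lexicon, classifier, or human annotation); a word count like this one only covers verbosity.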

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed review and constructive feedback. We address each major comment below, proposing revisions to strengthen the manuscript where the concerns are valid.

Referee: [Abstract] Abstract: the central claims rest on the existence of specific correlations (verbosity by topic difficulty, verbosity increase over sessions, sentiment by topic type, self-disclosure by intimacy), yet the abstract supplies neither sample size, statistical tests, confidence intervals, nor any description of how the three topic groups were validated as intimacy gradients independent of other factors.

Authors: We agree the abstract is too concise and omits key quantitative details. The topic groups were developed in consultation with gerontologists to reflect increasing intimacy/difficulty, but the abstract does not describe this process or report sample size, tests, or intervals. In revision we will expand the abstract to include sample size, note the primary statistical tests (e.g., repeated-measures ANOVA or regression), briefly describe the gerontologist-guided validation of the intimacy gradient, and add any feasible confidence intervals within length limits.

revision: yes

Referee: [Abstract] Abstract / Results: the interpretation that the observed patterns 'reflect positively on the sophistication of our dialogue system' requires that topic-group assignment and session-order effects have been disentangled; the manuscript provides no evidence of mixed-effects modeling, counterbalancing, or controls for individual differences or familiarity with the avatar, leaving the attribution vulnerable to confounds.

Authors: The referee is correct: the manuscript presents an observational study without mixed-effects models, counterbalancing of topic order, or explicit controls for individual differences or avatar familiarity. Session order was fixed by design, and topic assignment followed the pre-defined intimacy groups. We will revise the interpretation section and abstract to remove or qualify the claim that the patterns reflect positively on the dialogue system's sophistication, presenting the results instead as descriptive behavioral patterns of interest for multi-topic dialogue design. We will also add a limitations paragraph acknowledging the absence of these controls.

revision: yes

Circularity Check

0 steps flagged

No significant circularity: purely observational empirical reporting

The paper reports measured correlations from dialogue transcripts (verbosity by topic difficulty, sentiment by topic type, self-disclosure by intimacy) without any equations, parameter fitting, predictions, or derivation steps. Topic groupings are presented as input categories developed with gerontologists rather than outputs derived from the data. No self-citations, uniqueness theorems, or ansatzes are invoked to support central claims. The interpretive remark that results 'reflect positively on the sophistication of our dialogue system' is post-hoc commentary, not a load-bearing reduction to fitted inputs or self-definitions. The analysis is self-contained against external benchmarks as direct observation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model or derivation is present; all claims rest on the unelaborated user-study data collection and analysis described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "analysis of the dialogues reveals correlations such as greater verbosity for more difficult topics, increasing verbosity with successive sessions... stronger sentiment on topics concerned with life goals... stronger self-disclosure for more intimate topics"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "The three groups vary in “degrees of intimacy”, and as such in degrees of difficulty for the user"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.