Social Story Frames: Contextual Reasoning about Narrative Intent and Reception

Achyutarama R. Ganti; Andrew Piper; Joel Mire; Maarten Sap; Maria Antoniak; Steven R. Wilson; Zexin Ma

arXiv: 2512.15925 · v2 · submitted 2025-12-17 · cs.CL · cs.AI · cs.LG · cs.SI

Joel Mire, Maria Antoniak, Steven R. Wilson, Zexin Ma, Achyutarama R. Ganti, Andrew Piper, Maarten Sap

Pith reviewed 2026-05-16 21:14 UTC · model grok-4.3

keywords: reader response · narrative intent · social media stories · contextual reasoning · computational pragmatics · storytelling analysis · affective response modeling

The pith

SocialStoryFrames distills plausible reader inferences about story intent, affect, and judgments from conversational context using a narrative taxonomy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SocialStoryFrames as a formalism that captures inferences readers make about narratives, including perceived author intent, explanatory and predictive reasoning, emotional responses, and value judgments. It combines conversational context with a taxonomy drawn from narrative theory, pragmatics, and psychology to enable computational modeling of these responses. Two models, SSF-Generator and SSF-Classifier, are built and tested through human surveys with 382 participants and expert annotations. The approach is then applied to a corpus of 6,140 social media stories to measure how often different intents appear, how they relate to one another, and how narrative practices differ across online communities. This framework supports large-scale study of storytelling reception that current models cannot handle in a context-sensitive way.

Core claim

We introduce SocialStoryFrames, a formalism for distilling plausible inferences about reader response, such as perceived author intent, explanatory and predictive reasoning, affective responses, and value judgments, using conversational context and a taxonomy grounded in narrative theory, linguistic pragmatics, and psychology. We develop two models, SSF-Generator and SSF-Classifier, validated through human surveys and expert annotations, and apply them to a corpus of 6,140 social media stories to characterize storytelling intents and compare practices across communities.

What carries the argument

SocialStoryFrames, the formalism and taxonomy that links conversational context to inferences about reader responses including intent, reasoning, affect, and judgments.

If this is right

Characterizes the frequency and interdependence of storytelling intents across social media stories.
Compares and contrasts narrative practices and their diversity across different online communities.
Enables new research into storytelling at scale by connecting fine-grained context modeling with a generic taxonomy of reader responses.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same formalism could be adapted to study reader responses in longer-form fiction or news narratives if the context window is extended.
Predicted reader frames might be used to guide story generation systems toward responses the author intends.
Patterns of intent interdependence could reveal how community norms shape what counts as a successful story.

Load-bearing premise

The taxonomy accurately captures the full range of real human reader responses to stories and the trained models generalize beyond the surveyed participants and annotated data.

What would settle it

A new survey where participants read held-out stories, report their own inferences about intent and responses, and the SSF-Classifier outputs are checked for agreement with those reports.
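One way to score such a check is sketched below, assuming each story ends up with a set of taxonomy labels from the classifier and a set from the participants. The Jaccard index is borrowed from the paper's Figure 10, where it is used for inter-rater agreement; applying it to model-versus-reader agreement is an assumption here, and the function names, story ids, and label names are all hypothetical.

```python
from statistics import mean


def jaccard(a: set, b: set) -> float:
    """Jaccard index |a & b| / |a | b|; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_agreement(model_labels: dict, reader_labels: dict) -> float:
    """Average per-story Jaccard overlap between predicted and reported label sets."""
    shared = sorted(model_labels.keys() & reader_labels.keys())
    if not shared:
        raise ValueError("no stories in common")
    return mean(jaccard(model_labels[s], reader_labels[s]) for s in shared)


# Hypothetical held-out stories with made-up label names (not from the paper).
model_labels = {"s1": {"seek_support", "entertain"}, "s2": {"inform"}}
reader_labels = {"s1": {"seek_support"}, "s2": {"inform"}}

score = mean_agreement(model_labels, reader_labels)
print(f"mean Jaccard agreement: {score:.2f}")
```

A set-overlap score is used because a story can plausibly carry several labels at once; a single-label setup could use plain accuracy or a chance-corrected coefficient instead.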

Figures

Figures reproduced from arXiv: 2512.15925 by Achyutarama R. Ganti, Andrew Piper, Joel Mire, Maarten Sap, Maria Antoniak, Steven R. Wilson, Zexin Ma.

**Figure 2.** We introduce SSF-TAXONOMY, a taxonomy of reader response to informal storytelling on social media.

**Figure 3.** Human Inference Plausibility Ratings. The rates of “very likely” or “somewhat likely” indicate that …

**Figure 4.** Subreddit similarity rankings according to …

**Figure 5.** Author-centric and reader-centric lenses onto …

**Figure 6.** Mean “Consistency” and “Relevance” context summary ratings for annotator1 (…

**Figure 7.** An example of a story, plus the available community and conversational context, presented to an annotator

**Figure 8.** Examples of plausibility rating questions for the overall_goal, narrative_intent, and au…

**Figure 9.** An example of the dimension-specific questions asking annotators to substitute the placeholder for a …

**Figure 10.** Inter-rater Agreement using Jaccard Index for Taxonomy Classification. Values represent average Jaccard …

**Figure 11.** How well ssf-sim and standard semantic similarity metrics recover human preferences for story similarity. The results from the small-scale annotation are reported in …

**Figure 12.** Bar plots showing the SSF-TAXONOMY dimension-level sublabel distributions for SSF-CORPUS and SSF-STRATIFIED-CORPUS

**Figure 13.** Normalized pointwise mutual information (NPMI) between overall goals and narrative intents

**Figure 14.** Comparison between subreddits with the highest (…

Original abstract

Reading stories evokes rich interpretive, affective, and evaluative responses, such as inferences about narrative intent or judgments about characters. Yet, computational models of reader response are limited, preventing nuanced analyses. To address this gap, we introduce SocialStoryFrames, a formalism for distilling plausible inferences about reader response, such as perceived author intent, explanatory and predictive reasoning, affective responses, and value judgments, using conversational context and a taxonomy grounded in narrative theory, linguistic pragmatics, and psychology. We develop two models, SSF-Generator and SSF-Classifier, validated through human surveys (N=382 participants) and expert annotations, respectively. We conduct pilot analyses to showcase the utility of the formalism for studying storytelling at scale. Specifically, applying our models to SSF-Corpus, a curated dataset of 6,140 social media stories from diverse contexts, we characterize the frequency and interdependence of storytelling intents, and we compare and contrast narrative practices (and their diversity) across communities. By linking fine-grained, context-sensitive modeling with a generic taxonomy of reader responses, SocialStoryFrames enable new research into storytelling in online communities.
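The "interdependence of storytelling intents" is reported in the paper's Figure 13 as normalized pointwise mutual information (NPMI) between overall goals and narrative intents. The sketch below shows how such a table could be computed from labeled stories, using the standard definition NPMI(x, y) = ln(p(x, y) / (p(x) p(y))) / (−ln p(x, y)). It assumes one (goal, intent) observation per row; the function name and the label names are hypothetical, and this is not the authors' code.

```python
import math
from collections import Counter


def npmi_table(pairs):
    """Return {(goal, intent): NPMI} for every pair observed at least once."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)
    goal_counts = Counter(goal for goal, _ in pairs)
    intent_counts = Counter(intent for _, intent in pairs)

    table = {}
    for (goal, intent), count in joint.items():
        p_xy = count / n
        p_x = goal_counts[goal] / n
        p_y = intent_counts[intent] / n
        if p_xy == 1.0:
            # Only one pair type exists, so the ratio is 0/0; use the limiting value 1.
            table[(goal, intent)] = 1.0
            continue
        pmi = math.log(p_xy / (p_x * p_y))
        table[(goal, intent)] = pmi / -math.log(p_xy)
    return table


# Made-up labels: each goal always appears with the same intent.
clustered = npmi_table([("vent", "seek_support"), ("vent", "seek_support"),
                        ("teach", "inform"), ("teach", "inform")])

# Made-up labels: goals and intents are statistically independent.
independent = npmi_table([("vent", "seek_support"), ("vent", "inform"),
                          ("teach", "seek_support"), ("teach", "inform")])
```

Values near 1 mean a goal and an intent almost always occur together, and values near 0 mean they co-occur about as often as chance predicts. Pairs that never co-occur (NPMI of −1) are simply absent from this table.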

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SocialStoryFrames gives a workable formalism and corpus for reader-response modeling, but the validation evidence stays thin.

The paper's core move is introducing SocialStoryFrames as a taxonomy and inference system for what readers pick up from online stories (author intent, explanations, predictions, feelings, and value judgments), then pairing it with a generator model, a classifier, and a 6,140-story corpus drawn from varied communities. That package is new; prior work on narrative modeling has not combined this exact pragmatic taxonomy with scalable generation and classification over social media text. The grounding in narrative theory and pragmatics is straightforward and helps avoid pure data-fitting. The pilot analyses that compare storytelling patterns across communities also show the formalism can surface concrete differences in intent frequency and diversity, which is the kind of output that could feed follow-on studies.

The human survey (N=382) and expert annotations are presented as validation, and the corpus itself is a tangible resource. Those pieces give the work a usable starting point for computational social science or narrative AI groups.

The soft spot is the validation itself. The abstract and stress-test note mention the survey and annotations but give no inter-annotator agreement numbers, no cross-validation details, and no held-out community tests. Without those, it is hard to tell how much the models are capturing stable reader inferences versus patterns specific to the surveyed group. That gap does not sink the formalism, but it does leave the central claim, that the outputs faithfully represent real reader responses, on preliminary footing.

Readers working on reader-response modeling or online storytelling will find the taxonomy and corpus worth examining. The paper is coherent on its own terms and shows clear engagement with the literature, so it merits a serious referee who can press on the reliability metrics and generalization tests.

I would bring it to a reading group for the formalism and data, but I would not cite the models yet without seeing stronger validation.

Referee Report

2 major / 1 minor

Summary. The paper introduces SocialStoryFrames, a formalism for distilling plausible inferences about reader responses to stories (perceived author intent, explanatory/predictive reasoning, affective responses, value judgments) grounded in narrative theory, linguistic pragmatics, and psychology. It presents SSF-Generator and SSF-Classifier models, validates them via a human survey (N=382) and expert annotations, and applies the models to the SSF-Corpus of 6,140 social media stories for pilot analyses of storytelling intent frequencies, interdependencies, and cross-community differences.

Significance. If the taxonomy and models are shown to faithfully capture real reader inferences, the work would provide a useful bridge between qualitative narrative studies and scalable computational analysis of online storytelling. The grounding in external theory (rather than self-referential fitting) and the application to a diverse corpus are strengths that could support new research on narrative reception. Current impact is limited by the preliminary validation.

major comments (2)

[Validation and model evaluation sections] The human validation (N=382) and expert annotation procedures are described without inter-annotator agreement metrics, full results, error analysis, or cross-validation details. This is load-bearing for the central claim that the SocialStoryFrames taxonomy and models capture real reader inferences rather than cohort-specific patterns.
[Corpus application and pilot analyses] No out-of-domain testing or held-out community evaluation is reported for the SSF-Generator/SSF-Classifier when applied to the 6,140-story SSF-Corpus. Without this, generalization beyond the surveyed participants cannot be assessed, undermining the pilot analyses of community differences.

minor comments (1)

[Abstract] The abstract refers to 'pilot analyses' without specifying concrete new findings on intent interdependence or community diversity beyond frequency counts.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which identifies key areas where additional rigor will strengthen the paper's claims about the SocialStoryFrames taxonomy and models. We address each major comment below and commit to revisions that incorporate the requested details while preserving the pilot nature of the corpus analyses.

Referee: [Validation and model evaluation sections] The human validation (N=382) and expert annotation procedures are described without inter-annotator agreement metrics, full results, error analysis, or cross-validation details. This is load-bearing for the central claim that the SocialStoryFrames taxonomy and models capture real reader inferences rather than cohort-specific patterns.

Authors: We agree these metrics are essential. The original submission reported aggregate survey outcomes and expert annotation counts but omitted agreement statistics, full breakdowns, error analysis, and cross-validation. In the revised manuscript we will add Krippendorff's alpha (or equivalent) for the expert annotations, complete survey results with demographic and response distributions, a detailed error analysis contrasting model predictions against human judgments, and cross-validation performance from model training. These additions will directly support the claim that the taxonomy captures generalizable reader inferences.

revision: yes

Referee: [Corpus application and pilot analyses] No out-of-domain testing or held-out community evaluation is reported for the SSF-Generator/SSF-Classifier when applied to the 6,140-story SSF-Corpus. Without this, generalization beyond the surveyed participants cannot be assessed, undermining the pilot analyses of community differences.

Authors: The SSF-Corpus analyses are presented explicitly as pilot explorations to illustrate the formalism's utility at scale, with primary validation residing in the human survey. We acknowledge that explicit out-of-domain or held-out community testing would better bound generalization. In the revision we will add a held-out evaluation (e.g., training on a random subset or community-stratified split of the corpus and reporting performance on the remainder) together with an explicit limitations paragraph on community-specific patterns. This will qualify the pilot results without overclaiming generalizability.

revision: partial
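A held-out community evaluation of the kind the referee asks for could be organized as a leave-one-community-out split, sketched below. This is one possible reading of the request, not the protocol the authors commit to; the `community` field, the function name, and the example subreddit names are hypothetical.

```python
from collections import defaultdict


def leave_one_community_out(stories):
    """Yield (held_out_community, train, test) splits, one per community."""
    by_community = defaultdict(list)
    for story in stories:
        by_community[story["community"]].append(story)

    for held_out in sorted(by_community):
        test = by_community[held_out]
        # Train on every story whose community differs from the held-out one.
        train = [s for name, items in by_community.items()
                 if name != held_out for s in items]
        yield held_out, train, test


# Hypothetical records; a real corpus row would also carry text and labels.
stories = [
    {"id": 1, "community": "r/example_a"},
    {"id": 2, "community": "r/example_a"},
    {"id": 3, "community": "r/example_b"},
    {"id": 4, "community": "r/example_c"},
]

splits = list(leave_one_community_out(stories))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Because no story from the held-out community is seen during training, the test score on each split estimates how well a model transfers to an unseen community, which a random split cannot show.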

Circularity Check

0 steps flagged

No circularity detected in derivation or validation chain

The paper's central formalism is explicitly grounded in external narrative theory, linguistic pragmatics, and psychology rather than self-referential definitions or fits. SSF-Generator and SSF-Classifier are developed and validated via independent human surveys (N=382) and expert annotations before being applied to the separate SSF-Corpus; no equations, parameter fitting, or predictions reduce by construction to the inputs, and no load-bearing self-citations or uniqueness theorems are invoked. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the assumption that the theory-grounded taxonomy captures genuine reader responses and that the models faithfully implement it.

axioms (1)

domain assumption: A taxonomy grounded in narrative theory, linguistic pragmatics, and psychology accurately represents plausible reader inferences.
Invoked when defining SocialStoryFrames in the abstract.

invented entities (1)

SocialStoryFrames (no independent evidence)
purpose: Formalism for distilling reader response inferences from stories
Newly introduced structured representation.

