pith. sign in

arxiv: 2604.10368 · v1 · submitted 2026-04-11 · 💻 cs.CL

A Structured Clustering Approach for Inducing Media Narratives

Pith reviewed 2026-05-10 15:15 UTC · model grok-4.3

classification 💻 cs.CL
keywords media narrativesstructured clusteringnarrative schemasframing theoryevent and character modelingcomputational journalismscalable text analysis
0
0 comments X

The pith

A structured clustering approach induces explainable narrative schemas from media texts by jointly grouping events and characters, matching framing theory at large scale without manual labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a computational framework to automatically extract rich storytelling structures from news and media. It does so by clustering events together with the characters who participate in them, rather than treating those elements separately. The resulting schemas are designed to be human-interpretable and to match the kinds of narrative patterns that communication researchers study under framing theory. Because the process requires no exhaustive hand-labeling or domain-specific taxonomies, it can handle very large collections of texts. If the method works as claimed, analysts could track how media organizations construct meaning across thousands of stories without the usual bottlenecks of manual coding.

Core claim

Jointly clustering events and characters produces narrative schemas that are both explainable and consistent with established framing theory, allowing the approach to scale to large corpora without exhaustive manual annotation.

What carries the argument

Structured clustering that jointly models events and characters to form narrative schemas.

If this is right

  • Large media collections can be analyzed for narrative patterns without creating domain-specific taxonomies or exhaustive manual labels.
  • The induced schemas remain interpretable and align with framing theory from communication research.
  • Researchers gain a scalable way to study how media shapes public opinion through storytelling structures.
  • Existing coarse-grained or taxonomy-bound methods can be replaced by this joint event-character modeling for finer-grained analysis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same joint-clustering logic could be tested on social-media threads or historical archives to see whether narrative schemas transfer across genres.
  • Downstream applications such as automated bias detection or story summarization might use the schemas as intermediate representations.
  • If the schemas prove robust, they could serve as a bridge between computational text analysis and qualitative media studies.

Load-bearing premise

That jointly modeling events and characters through structured clustering will capture the nuanced storytelling structures that communication theory treats as central to meaning construction.

What would settle it

Human experts annotate a held-out set of media texts for framing elements and narrative roles; if the automatically induced schemas show low agreement with those annotations or yield no new scalable insights on the full corpus, the central claim fails.

Figures

Figures reproduced from arXiv: 2604.10368 by Advait Deshmukh, Alexandria Leto, I-Ta Lee, Maria Leonor Pacheco, Rohan Das, Zohar Naaman.

Figure 1
Figure 1. Figure 1: Law enforcement spending framed as worker [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: For a large scale news corpus, we first construct narrative event chains and obtain character and role [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: SHAP feature importance comparison between RoBERTa embeddings and narrative features for frame [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: SHAP feature importance values ( [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Crime and Punishment frame in gun control news articles. Full length narrative schemas are listed in App. F.4.1. tional rights supporters, shaping enforcement effectiveness. 2. The conflict between self-defense rights and strict gun control laws leads to legal conse￾quences for homeowners, prompt… view at source ↗
Figure 7
Figure 7. Figure 7: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Economic Identity frame in immigration news articles. Full length narrative schemas are listed in App. F.4.3. 3. The increase in immigration service fees cre￾ates significant financial barriers for immi￾grants seeking citizenship and document ser￾vices. 4. Increasing immigration quotas addresses … view at source ↗
Figure 8
Figure 8. Figure 8: Top 10 most important features for frame [PITH_FULL_IMAGE:figures/full_fig_p022_8.png] view at source ↗
read the original abstract

Media narratives wield tremendous power in shaping public opinion, yet computational approaches struggle to capture the nuanced storytelling structures that communication theory emphasizes as central to how meaning is constructed. Existing approaches either miss subtle narrative patterns through coarse-grained analysis or require domain-specific taxonomies that limit scalability. To bridge this gap, we present a framework for inducing rich narrative schemas by jointly modeling events and characters via structured clustering. Our approach produces explainable narrative schemas that align with established framing theory while scaling to large corpora without exhaustive manual annotation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents a framework for inducing rich narrative schemas from media texts by jointly modeling events and characters via structured clustering. It claims this produces explainable schemas aligned with framing theory while scaling to large corpora without exhaustive manual annotation.

Significance. If the central claims hold with supporting evidence, the work could meaningfully advance computational narrative analysis by offering a scalable, interpretable bridge to communication theory, addressing limitations of both coarse-grained NLP methods and domain-specific manual approaches.

major comments (2)
  1. Method section: The structured clustering approach is described only at a high level with no specification of the objective function, similarity metrics, or how events and characters are jointly represented and clustered. This directly bears on the central claim that the method captures nuanced storytelling structures, as the joint modeling mechanism remains unspecified and untestable.
  2. Evaluation section: No quantitative metrics, qualitative case studies, or mappings to specific framing theory constructs (e.g., conflict, human-interest, or episodic/thematic frames) are provided to substantiate the claim of alignment with established framing theory. Without such evidence, the alignment assertion cannot be assessed and is load-bearing for the paper's contribution.
minor comments (1)
  1. Abstract: While concise, it would benefit from a one-sentence summary of the datasets or corpora used to ground the scalability claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. The feedback identifies key areas where additional detail and evidence will strengthen the presentation of our structured clustering framework for inducing media narratives. We address each major comment below and commit to incorporating the necessary expansions in the revised version.

read point-by-point responses
  1. Referee: Method section: The structured clustering approach is described only at a high level with no specification of the objective function, similarity metrics, or how events and characters are jointly represented and clustered. This directly bears on the central claim that the method captures nuanced storytelling structures, as the joint modeling mechanism remains unspecified and untestable.

    Authors: We agree that the current method section provides only a high-level description and lacks the requested technical specifications. In the revised manuscript, we will expand this section to explicitly define the objective function (a joint optimization over event and character embeddings with structural regularization), the similarity metrics (a combination of embedding cosine similarity and graph-based structural distances), and the joint representation process (events and characters embedded in a shared latent space via a bipartite graph model prior to clustering). These additions will make the joint modeling mechanism fully specified, testable, and replicable while preserving the original claims. revision: yes

  2. Referee: Evaluation section: No quantitative metrics, qualitative case studies, or mappings to specific framing theory constructs (e.g., conflict, human-interest, or episodic/thematic frames) are provided to substantiate the claim of alignment with established framing theory. Without such evidence, the alignment assertion cannot be assessed and is load-bearing for the paper's contribution.

    Authors: We acknowledge that the evaluation section in the submitted manuscript does not include the quantitative metrics, case studies, or explicit mappings to framing theory constructs. In the revision, we will add quantitative evaluations using metrics such as cluster coherence scores and schema induction precision, along with qualitative case studies drawn from the corpus. We will also provide direct mappings of induced schemas to framing constructs (e.g., linking specific event-character clusters to conflict or human-interest frames). These changes will supply the evidence required to substantiate the alignment claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces a novel framework for inducing narrative schemas via joint structured clustering of events and characters. The central claim describes a methodological contribution that produces explainable schemas aligned with framing theory and scales without manual annotation or domain taxonomies. No equations, derivations, or load-bearing steps in the abstract reduce by construction to fitted parameters, self-definitions, or self-citation chains. The approach is positioned as an independent alternative to existing coarse-grained or taxonomy-limited methods, making the derivation chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no details on free parameters, axioms, or invented entities; all arrays left empty due to lack of information.

pith-pipeline@v0.9.0 · 5385 in / 974 out tokens · 69016 ms · 2026-05-10T15:15:57.304394+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 2 internal anchors

  1. [1]

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    “let your characters tell their story”: A dataset for character-centric narrative understanding. In Findings of the Association for Computational Lin- guistics: EMNLP 2021, pages 1734–1752, Punta Cana, Dominican Republic. Association for Compu- tational Linguistics. Dallas Card, Amber E. Boydstun, Justin H. Gross, Philip Resnik, and Noah A. Smith. 2015. T...

  2. [2]

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    Weakly-supervised modeling of contextual- ized event embedding for discourse relations. In Findings of the Association for Computational Lin- guistics: EMNLP 2020, pages 4962–4972, Online. Association for Computational Linguistics. Alexandria Leto, Elliot Pickens, Coen Needell, David Rothschild, and Maria Leonor Pacheco. 2024. Fram- ing in the presence of...

  3. [3]

    InProceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2219–2263, Online

    Modeling framing in immigration discourse on social media. InProceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2219–2263, Online. Association for Computational Linguistics. Nona Naderi and Graeme Hirst. 2017. Classifying frames at the sentence level in news...

  4. [4]

    InProceedings of the 2021 Conference on Em- pirical Methods in Natural Language Processing, pages 5427–5440, Online and Punta Cana, Domini- can Republic

    Corpus-based open-domain event type induc- tion. InProceedings of the 2021 Conference on Em- pirical Methods in Natural Language Processing, pages 5427–5440, Online and Punta Cana, Domini- can Republic. Association for Computational Lin- guistics. Dominik Stammbach, Maria Antoniak, and Elliott Ash

  5. [5]

    InProceedings of the 4th Workshop of Narrative Understanding (WNU2022), pages 47–56, Seattle, United States

    Heroes, villains, and victims, and GPT-3: Au- tomated extraction of character roles without train- ing data. InProceedings of the 4th Workshop of Narrative Understanding (WNU2022), pages 47–56, Seattle, United States. Association for Computational Linguistics. 12 Xiaozhi Wang, Yulin Chen, Ning Ding, Hao Peng, Zimu Wang, Yankai Lin, Xu Han, Lei Hou, Juanzi...

  6. [6]

    Pro” or “Anti

    Media attitude detection via framing analysis with events and their relations. InProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 17197–17210, Miami, Florida, USA. Association for Computational Lin- guistics. Caleb Ziems and Diyi Yang. 2021. To protect and to serve? analyzing entity-centric framing of police vi...

  7. [7]

    We trained the model for a maximum of 150 epochs and calibrated based on performance on the development set

    The model was trained using Adam optimizer with a learning rate of 1e−6 , linear learning rate scheduling with warmup (20% of the first epoch) and a weight decay of 1e−2 . We trained the model for a maximum of 150 epochs and calibrated based on performance on the development set. Silver labels improved performance of the system by ≈ 24 points over trainin...

  8. [8]

    Issue: A schema contains this frame element if it references the core debate in the given domain and its causal agents, either implicitly or explicitly

  9. [9]

    The means through which causal agents achieve their effects are generally in- cluded within the scope of this element

    Evaluation: A schema employs this element if it includes discussion of causal agents and their effects. The means through which causal agents achieve their effects are generally in- cluded within the scope of this element

  10. [10]

    We find high rates of theevaluationframe el- ement in schemas

    Resolution: The schema contains specific sug- gestions proposed as solutions, separate from the causal interpretation of evaluation. We find high rates of theevaluationframe el- ement in schemas. This prominent discussion of cause and effects aligns with the fact that the schemas are generated based on clusters of causal event chain verbalizations; it is ...

  11. [11]

    Immigrants face significant challenges access- ing healthcare due to restrictive policies and societal fears, with varying government and advocacy responses

  12. [12]

    Icould give you a litany of laws bro-ken or lines crossed. Laws some-times don’t protect the innocent,tragically enough

    The failure of U.S. immigration policies and governmental actions leads to significant hard- 18 ErrorType Article Excerpt Causal EventPair Verbalization Character Annotation Assigned Schema Analysis Incorrectevent ex-traction Sen. Larry Craig, R-Idaho, saidDemocrats would turn off voters ifthey used the Virginia Tech tragedyto roll back gun owners’ rights...

  13. [13]

    Immigrants facing exploitation and significant challenges receive support from advocates who push for solutions like legal status

  14. [14]

    Immigrants face significant challenges access- ing essential services due to language barriers, despite some positive efforts by government and advocates

  15. [15]

    Immigrants face significant challenges in achieving necessary English proficiency, which affects their ability to integrate into so- ciety and access opportunities

  16. [16]

    government’s aggressive enforce- ment of deportation policies targets im- migrants viewed as threats, leading to widespread arrests and expulsions

    The U.S. government’s aggressive enforce- ment of deportation policies targets im- migrants viewed as threats, leading to widespread arrests and expulsions

  17. [17]

    F.3 Narrative Schemas for Frame Level Analysis The complete LLM-generated narrative schemas corresponding to Fig

    Illegal immigration is causing significant prob- lems in the U.S., such as job competition and increased crime, necessitating stronger border control and legal measures. F.3 Narrative Schemas for Frame Level Analysis The complete LLM-generated narrative schemas corresponding to Fig. 4 are presented below in decreasing orderof feature importance

  18. [18]

    Full length narrative schemas are listed in App

    Judicial decisions on gun laws spark conflicts between public safety advocates and constitu- 19 Figure 5: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Crime and Punishment frame in gun control news articles. Full length narrative schemas are listed in App. F.4.1. tional rights supporters, ...

  19. [19]

    The conflict between self-defense rights and strict gun control laws leads to legal conse- quences for homeowners, prompting legisla- tive changes to protect those using firearms in self-defense

  20. [20]

    The judiciary plays a central role in shaping gun control through legal rulings that balance public safety with constitutional rights

  21. [21]

    The right to bear arms is upheld as a funda- mental freedom, yet its exercise leads to tragic consequences and public safety concerns

  22. [22]

    Legal battles over gun manufacturer liability drive efforts by advocates and governments to hold companies accountable, while facing op- position from gun rights supporters and politi- cians aiming to protect the industry

  23. [23]

    F.4 Additional Frame Level Analysis F.4.1 Crime and Punishment - Gun Control The complete LLM-generated narrative schemas corresponding to Fig

    The conflict between public safety concerns and Second Amendment rights drives debates over gun control measures. F.4 Additional Frame Level Analysis F.4.1 Crime and Punishment - Gun Control The complete LLM-generated narrative schemas corresponding to Fig. 5 are presented below in decreasing orderof feature importance

  24. [26]

    The ineffective enforcement of existing gun control laws leads to continued criminal ac- cess to firearms, necessitating stricter regula- tions and enhanced oversight

  25. [27]

    Strict enforcement of mandatory prison sen- tences for illegal firearm possession and sales effectively reduces gun-related crimes through law enforcement and political support

  26. [28]

    Stricter penalties for felons possessing or us- ing firearms effectively reduce gun crimes through mandatory prison terms and deter- rence

  27. [29]

    F.4.2 Cultural Identity - Immigration The complete LLM-generated narrative schemas corresponding to Fig

    The consequences of gun possession and laws lead to interactions with law enforcement and varying outcomes. F.4.2 Cultural Identity - Immigration The complete LLM-generated narrative schemas corresponding to Fig. 6 are presented below in decreasing orderof feature importance

  28. [30]

    Immigrants and advocates champion their rights and dignity against anti-immigrant poli- cies, emphasizing human rights and contribu- tions to society while advocating for reform

  29. [31]

    The rapid growth of immigrant populations is causing significant demographic changes and straining local social services

  30. [32]

    20 Figure 6: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Cultural Identity frame in immigration news articles

    Immigrants, supported by heroic advocates like Sister Jean Marshall, overcome integra- tion challenges through cultural preservation and language education. 20 Figure 6: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Cultural Identity frame in immigration news articles. Full length narrative...

  31. [34]

    Politicians and advocates push for laws pro- viding a path to citizenship for undocumented immigrants while enforcing border security to address illegal immigration fairly

  32. [35]

    F.4.3 Economic - Immigration The complete LLM-generated narrative schemas corresponding to Fig

    Immigrants positively contribute to the econ- omy through work, job creation, and en- trepreneurship, supported by advocates and policies. F.4.3 Economic - Immigration The complete LLM-generated narrative schemas corresponding to Fig. 7 are presented below in decreasing orderof feature importance

  33. [36]

    The high economic costs of illegal immigra- tion impose a significant fiscal burden on fed- eral and state governments, necessitating pol- icy changes to reduce expenses

  34. [37]

    Figure 7: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Economic Identity frame in immigration news articles

    Immigrants positively contribute to the econ- omy through work, job creation, and en- trepreneurship, supported by advocates and policies. Figure 7: SHAP feature importance values (×1000) for the top 6 narrative schemas and top 2 role features that predict the Economic Identity frame in immigration news articles. Full length narrative schemas are listed i...

  35. [38]

    The increase in immigration service fees cre- ates significant financial barriers for immi- grants seeking citizenship and document ser- vices

  36. [39]

    Increasing immigration quotas addresses la- bor shortages and supports economic stability through policymakers’ actions

  37. [40]

    Stricter immigration enforcement and labor shortages in agriculture prompt calls for re- form to stabilize the workforce

  38. [41]

    job market by sometimes displacing lo- cal workers while also preventing overseas job relocation

    The hiring of foreign workers impacts the U.S. job market by sometimes displacing lo- cal workers while also preventing overseas job relocation. F.5 Domain Level Analysis F.5.1 Immigration The complete LLM-generated narrative schemas corresponding to Fig. 8 are presented below in decreasing orderof feature importance

  39. [42]

    Immigrants and advocates champion their rights and dignity against anti-immigrant poli- cies, emphasizing human rights and contribu- tions to society while advocating for reform. 21 Figure 8: Top 10 most important features for frame prediction in the immigration domain, showing SHAP importance values (×1000) for narrative schemas (blue) and character port...

  40. [43]

    The contentious political debate over immigra- tion reform highlights legislative conflicts and compromised resolutions between Democrats and Republicans

  41. [44]

    government and law enforcement are combating illegal immigration and human trafficking, which victimize immigrants and pose threats to national security

    The U.S. government and law enforcement are combating illegal immigration and human trafficking, which victimize immigrants and pose threats to national security

  42. [45]

    Immigrants face challenges in obtaining le- gal status while advocates and policymakers implement measures to provide pathways for residency and integration

  43. [46]

    Fraudulent immigration activities exploit vul- nerable individuals, prompting legal actions and advocacy efforts to combat exploitation

  44. [47]

    Politicians are leading efforts to reform im- migration laws, providing pathways to legal status and citizenship for undocumented im- migrants

  45. [48]

    Politicians collaborating on immigration re- form to address illegal immigration through policies like temporary-worker programs and legal status. Figure 9: Top 10 most important features for frame prediction in the immigration domain, showing SHAP importance values (×1000) for narrative schemas (blue) and character portrayals (red). Features are ranked b...

  46. [49]

    Judicial decisions on gun laws spark conflicts between public safety advocates and constitu- tional rights supporters, shaping enforcement effectiveness

  47. [50]

    Stricter gun control laws and enforcement efforts aim to address loopholes and illegal firearm trafficking, which pose significant pub- lic safety risks

  48. [51]

    Politicians’ efforts to pass assault weapons bans address the issue of reducing criminal gun violence through legislative action despite strong opposition from gun rights advocates

  49. [52]

    The debate over gun control highlights a cen- tral conflict between efforts to reduce violence through regulation and arguments that more firearms enhance public safety

  50. [53]

    Politicians face challenges and pressures from gun rights advocates while attempting to pass 22 or block gun control legislation, with some actions leading to significant defeats for gun control advocates

  51. [54]

    Judicial decisions and legal challenges shape the trajectory of gun control laws through court rulings and legislative actions

  52. [55]

    Illegal gun trafficking, facilitated by weak regulations and loopholes, leads to increased crime, prompting law enforcement actions and calls for stricter legislation

  53. [56]

    The impact of gun availability on public safety is highlighted through incidents of self- defense and tragic shootings, illustrating both protection and loss

  54. [57]

    Government and political leaders are proac- tively advancing gun control measures through legislative actions and policy sign- ings

  55. [58]

    because",

    The judiciary plays a central role in shaping gun control through legal rulings that balance public safety with constitutional rights. G LLM Hyperparameters We used the following hyperparamters for all DeepSeek-R1-Distill-Llama-70B inference tasks: random seed = 42, temperature = 0 (for determinis- tic outputs), and context length = 14336. H Compute Resou...

  56. [59]

    Identify which character groups appear in a given event chain (a sentence describing causal relationships)

  57. [60]

    Classify each character’s role as Hero, Victim, Threat, or Neutral based on their portrayal

  58. [61]

    characters

    Determine if the event chain’s framing indicates a pro, anti, or neutral stance toward the domain topic ## Critical Constraint IMPORTANT: Analyze ONLY characters and entities explicitly mentioned in the event chain. Ignore all other characters and entities in the article. The full article is for context only. ## Annotation Guidelines ### Character Identif...

  59. [62]

    Analyze the provided sentences to identify the central narrative theme

  60. [63]

    Consider how the character roles interact within this narrative framework

  61. [64]

    **Conservative Framing Assessment**: Only after constructing your theme, examine whether the cluster exhibits **clear and explicit** Entman’s framing elements: - **Issue/Problem**: A specific, well-defined problem that is explicitly articulated (not just implied) - **Evaluation**: Clear moral judgments or strong value assessments about causes, effects, or...

  62. [65]

    Subtle implications, vague suggestions, or weak patterns should NOT trigger framing detection

    **Important**: Only detect framing patterns when these elements are **prominently and explicitly present** in the cluster sentences. Subtle implications, vague suggestions, or weak patterns should NOT trigger framing detection

  63. [66]

    **Theme Construction**: - Generate ONE concise sentence (15-25 words) that captures the overarching narrative theme - Include specific events, actions, or circumstances that recur across the sentences - Focus on relationship dynamics, conflict, causation, or resolution patterns - If clear framing patterns exist, ensure the theme incorporates the central i...

  64. [67]

    Do not paraphrase or create new text - only use exact portions of the theme sentence

    **Framing Elements Identification**: If framing patterns are detected, identify the exact substrings from your generated theme sentence that correspond to each framing element. Do not paraphrase or create new text - only use exact portions of the theme sentence. **Overlapping substrings between issue, evaluation, and resolution are permitted when necessar...

  65. [68]

    Ensure your narrative summary reflects the specific domain context provided

  66. [69]

    theme":

    Output your response as valid JSON in the specified format ## Input Format Each analysis task will include: - **DOMAIN**: The topic area (Gun Control or Immigration) - **CHARACTER ROLES**: List of possible character roles that may appear in the cluster - **CLUSTER SENTENCES**: A collection of sentences representing the narrative cluster along with their c...