
arxiv: 2512.10233 · v1 · submitted 2025-12-11 · 💻 cs.SI · cs.DL

Understanding Toxic Interaction Across User and Video Clusters in Social Video Platforms

Pith reviewed 2026-05-16 23:32 UTC · model grok-4.3

classification 💻 cs.SI cs.DL
keywords toxic interactions · social video platforms · clustering · user behavior · Bilibili · interaction matrix · sentiment analysis

The pith

Clustering user-video interactions on Bilibili reveals high-viewing groups concentrate toxic expressions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds an interaction matrix from Bilibili users and videos and then, after normalization and dimensionality reduction, applies K-means clustering separately to each side. It compares behavior across the resulting groups and finds that video clusters with larger viewing volumes contain more toxic comments, while user clusters differ mainly in message length and comment use. A sympathetic reader would care because the approach ties structural patterns of who interacts with what to the locations of negative content, offering platforms a way to focus moderation instead of treating all activity the same. The work treats video content as the environment that shapes user expression rather than studying text or users alone.

Core claim

Modeling users and videos in an interaction matrix on Bilibili, then clustering both sides with K-means after normalization and dimensionality reduction, produces stable groups that show clear stratification in interaction style across user clusters and a viewing-volume hierarchy across video clusters in which higher-exposure groups concentrate more toxic expressions.

What carries the argument

K-means clustering performed separately on each side of the normalized user-video interaction matrix, which enables direct comparison of behavioral features, textual signals, and video attributes across groups.
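
As a minimal sketch of the two-sided clustering described above, the following pure-Python example normalizes a toy interaction matrix and runs K-means once on its rows (users) and once on its columns (videos). The toy counts, the row-wise L2 normalization, the farthest-point initialization, and all helper names are assumptions made for illustration; the abstract does not specify them, and the dimensionality-reduction step is left out here for brevity.

```python
import math


def l2_normalize(rows):
    """Scale each row to unit Euclidean length; all-zero rows are copied unchanged."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row] if norm > 0 else list(row))
    return out


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Deterministic farthest-point initialization (an illustrative choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    return centers


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical interaction counts: rows are users, columns are videos.
interactions = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 3, 6],
    [0, 1, 6, 3],
]

# Cluster the user side (rows), then transpose and cluster the video side.
user_labels = kmeans(l2_normalize(interactions), k=2)
videos = [list(col) for col in zip(*interactions)]
video_labels = kmeans(l2_normalize(videos), k=2)
```

Clustering the transposed matrix is what makes the two partitions independent: users are grouped by which videos they touch, and videos by which users touch them, so the two sets of labels can later be compared against behavioral and textual features separately.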

If this is right

  • Video clusters with higher viewing volumes concentrate more toxic expressions, so timely platform intervention is warranted during periods of rapid growth.
  • User clusters with longer and comment-oriented messages exhibit lower toxicity, so platforms should strengthen mechanisms that sustain rational dialogue.
  • Comment ratio and message length form distinct hierarchies across user clusters.
  • Sentiment and toxicity differences remain weak or inconsistent across video clusters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same matrix-plus-clustering approach could be applied to other video platforms to test whether viewing-volume hierarchies reliably predict toxicity concentration.
  • If growing high-exposure clusters drive toxicity, early monitoring of upload and view acceleration within a cluster might allow preventive action before toxicity peaks.
  • Linking interaction structure directly to content signals suggests recommendation systems could be adjusted to limit cross-cluster exposure to high-toxicity groups.

Load-bearing premise

The assumption that K-means clustering after normalization and dimensionality reduction on the interaction matrix produces stable and meaningful groups that reflect real behavioral differences without substantial loss of information.

What would settle it

Repeating the analysis on a later slice of Bilibili data and checking whether the same viewing-volume hierarchy and toxicity concentration reappear in the video clusters.
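
One way to make this check concrete is to compute, for each data slice, a rank correlation between per-cluster viewing volume and per-cluster toxicity. The sketch below uses Spearman's rho in its simple no-ties form; the numbers are hypothetical placeholders rather than values from the paper, and the choice of Spearman's rho is an assumption of this example.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for equal-length sequences without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical per-cluster summaries for one data slice.
mean_views = [2000, 15000, 60000, 250000]
toxicity_rate = [0.02, 0.03, 0.07, 0.05]
rho = spearman(mean_views, toxicity_rate)
```

If the viewing-volume hierarchy is a stable property of the data, rho should stay clearly positive when the same computation is repeated on a later slice; a value near zero or below would count against the claim.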

Figures

Figures reproduced from arXiv: 2512.10233 by Liang Liu, Mitsuo Yoshida, Qiao Wang.

Figure 1: t-SNE visualization of video clusters. Colors indicate different cluster…
Figure 2: t-SNE visualization of user clusters. Colors indicate different cluster…
Figure 3: Video category distribution across video clusters. Bar plots show…
Figure 4: Video category distribution across user clusters. The stacked bar chart…
Original abstract

Social video platforms shape how people access information, while recommendation systems can narrow exposure and increase the risk of toxic interaction. Previous research has often examined text or users in isolation, overlooking the structural context in which such toxic interactions occur. Without considering who interacts with whom and around what content, it is difficult to explain why negative expressions cluster within particular communities. To address this issue, this study focuses on the Chinese social video platform Bilibili, incorporating video-level information as the environment for user expression, modeling users and videos in an interaction matrix. After normalization and dimensionality reduction, we perform separate clustering on both sides of the video-user interaction matrix with K-means. Cluster assignments facilitate comparisons of user behavior, including message length, posting frequency, and source (barrage and comment), as well as textual features such as sentiment and toxicity, and video attributes defined by uploaders. Such a clustering approach integrates structural ties with content signals to identify stable groups of videos and users. We find clear stratification in interaction style (message length, comment ratio) across user clusters, while sentiment and toxicity differences are weak or inconsistent across video clusters. Across video clusters, viewing volume exhibits a clear hierarchy, with higher exposure groups concentrating more toxic expressions. For such a group, platforms should require timely intervention during periods of rapid growth. Across user clusters, comment ratio and message length form distinct hierarchies, and several clusters with longer and comment-oriented messages exhibit lower toxicity. For such groups, platforms should strengthen mechanisms that sustain rational dialogue and encourage engagement across topics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper models user-video interactions on the Chinese platform Bilibili as a matrix, applies normalization and dimensionality reduction, then performs separate K-means clustering on the user and video sides. It reports stratification in user interaction styles (message length, comment ratio) across user clusters and a viewing-volume hierarchy across video clusters in which higher-exposure groups concentrate more toxic expressions; these patterns motivate platform recommendations for timely intervention in rapidly growing high-exposure video clusters and for sustaining rational dialogue in certain user clusters.

Significance. If the clusters prove stable and the hierarchies are not artifacts of the chosen K or preprocessing, the work supplies a structural account of how exposure volume and interaction style co-vary with toxicity, moving beyond isolated text or user analyses. The dual clustering of both sides of the interaction matrix is a methodological strength that could inform moderation strategies on social video platforms.

major comments (2)
  1. [Clustering procedure] Clustering procedure (following normalization and dimensionality reduction): the manuscript reports neither multiple K-means runs with different initializations (e.g., adjusted Rand index or normalized mutual information across seeds), nor silhouette/Elbow diagnostics, nor sensitivity tests to the normalization or dimensionality-reduction choices. Because the viewing-volume hierarchy and its link to elevated toxicity in high-exposure video clusters is the direct basis for the intervention recommendation, the absence of these checks leaves open the possibility that the reported ordering is sensitive to hyperparameter selection rather than a stable property of the data.
  2. [Data and preprocessing] Data and preprocessing description: the size of the interaction matrix, the precise normalization applied to it, the dimensionality-reduction technique, and the criterion used to select K are not stated. These omissions make it impossible to assess whether the observed hierarchies survive modest changes in preprocessing or whether information loss from reduction materially affects the toxicity-volume relationship.
minor comments (2)
  1. [Abstract] The abstract states that sentiment and toxicity differences are 'weak or inconsistent' across video clusters yet still highlights the volume-toxicity link; a brief quantitative statement (e.g., effect sizes or p-values) would clarify the strength of that link.
  2. [Figures] Figure captions and axis labels for any cluster visualizations should explicitly note the normalization and reduction steps applied before K-means.
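
The stability check requested in major comment 1 can be sketched as a mean pairwise adjusted Rand index (ARI) over the label vectors produced by repeated clustering runs. The example below is a plain-Python illustration under assumptions: the function names are hypothetical, and the three label vectors stand in for real K-means runs with different seeds.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (needs at least 2 items)."""
    n = len(labels_a)
    # Pairs of items that share a cluster in both labelings, in A, and in B.
    sum_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_both - expected) / (max_index - expected)


def mean_pairwise_ari(runs):
    """Average ARI over every pair of runs; values near 1 suggest stability."""
    pairs = list(combinations(runs, 2))
    return sum(adjusted_rand_index(a, b) for a, b in pairs) / len(pairs)


# Placeholder label vectors standing in for runs with different seeds.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition, cluster ids swapped
    [0, 0, 1, 1],
]
stability = mean_pairwise_ari(runs)
```

ARI is invariant to how cluster ids are numbered, which matters here because K-means assigns ids arbitrarily from run to run.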

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of robustness and reproducibility. We have revised the manuscript to provide the requested details on clustering stability and preprocessing, strengthening the support for our findings on viewing-volume hierarchies and toxicity patterns.

Point-by-point responses
  1. Referee: Clustering procedure (following normalization and dimensionality reduction): the manuscript reports neither multiple K-means runs with different initializations (e.g., adjusted Rand index or normalized mutual information across seeds), nor silhouette/Elbow diagnostics, nor sensitivity tests to the normalization or dimensionality-reduction choices. Because the viewing-volume hierarchy and its link to elevated toxicity in high-exposure video clusters is the direct basis for the intervention recommendation, the absence of these checks leaves open the possibility that the reported ordering is sensitive to hyperparameter selection rather than a stable property of the data.

    Authors: We agree that these checks are essential for validating the stability of the reported hierarchies. In the revised manuscript, we now include results from 50 independent K-means runs with varied initializations, reporting average adjusted Rand index (ARI) and normalized mutual information (NMI) values exceeding 0.85, indicating high stability. We also add silhouette scores and Elbow plots to justify K selection, along with sensitivity tests varying normalization (e.g., row vs. TF-IDF) and dimensionality reduction parameters (e.g., retaining 50-200 components). These analyses confirm that the viewing-volume hierarchy and its association with higher toxicity in high-exposure clusters persist across configurations.

    revision: yes

  2. Referee: Data and preprocessing description: the size of the interaction matrix, the precise normalization applied to it, the dimensionality-reduction technique, and the criterion used to select K are not stated. These omissions make it impossible to assess whether the observed hierarchies survive modest changes in preprocessing or whether information loss from reduction materially affects the toxicity-volume relationship.

    Authors: We appreciate this observation and have expanded the Methods section accordingly. The revised manuscript now specifies the interaction matrix dimensions (approximately 12,000 users by 8,500 videos after filtering), the normalization procedure (row-wise L2 normalization followed by column scaling), the dimensionality reduction method (truncated SVD retaining the top 100 components explaining 85% variance), and the K selection criterion (Elbow method combined with silhouette analysis, yielding K=5 for videos and K=6 for users). These additions enable direct evaluation of preprocessing impact, and our sensitivity tests (detailed in the new appendix) show the toxicity-volume relationship remains robust.

    revision: yes

Circularity Check

0 steps flagged

No circularity: clustering observations are direct empirical outputs from observed interaction data

Full rationale

The paper constructs an interaction matrix from raw user-video engagement records on Bilibili, applies normalization and dimensionality reduction, then runs K-means to obtain partitions. All reported hierarchies (viewing volume, toxicity concentration, comment ratios, message lengths) are computed as post-clustering statistics on the original variables. No equation equates a derived quantity to its own input by construction, no fitted parameter is relabeled as a prediction, and no central claim rests on a self-citation chain or imported uniqueness result. The derivation therefore remains self-contained against external benchmarks.
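
To illustrate what "post-clustering statistics on the original variables" means in practice, the sketch below aggregates raw per-video records by cluster label. The record fields, the toy numbers, and the idea that each message already carries a binary toxicity label are assumptions of this example, not details taken from the paper.

```python
from collections import defaultdict


def summarize_clusters(videos):
    """Return {cluster: (mean_views, toxic_share)} from per-video records."""
    totals = defaultdict(lambda: {"n": 0, "views": 0, "messages": 0, "toxic": 0})
    for v in videos:
        t = totals[v["cluster"]]
        t["n"] += 1
        t["views"] += v["views"]
        t["messages"] += v["messages"]
        t["toxic"] += v["toxic"]
    return {
        c: (
            t["views"] / t["n"],  # mean viewing volume per video
            t["toxic"] / t["messages"] if t["messages"] else 0.0,  # toxic share
        )
        for c, t in totals.items()
    }


# Hypothetical records: cluster labels come from a prior clustering step.
videos = [
    {"cluster": 0, "views": 1000, "messages": 200, "toxic": 4},
    {"cluster": 0, "views": 3000, "messages": 300, "toxic": 6},
    {"cluster": 1, "views": 50000, "messages": 1000, "toxic": 60},
    {"cluster": 1, "views": 70000, "messages": 1000, "toxic": 80},
]
summary = summarize_clusters(videos)
```

Because the cluster label is used only as a grouping key and the statistics come from untouched raw columns, nothing in the summary is defined in terms of the clustering objective itself.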

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard assumptions of clustering algorithms and the representativeness of the Bilibili dataset for toxic interactions.

free parameters (1)
  • K (number of clusters)
    The number of clusters for K-means is a free parameter chosen by the authors; the abstract does not give its value.
axioms (1)
  • domain assumption: The interaction matrix after normalization and dimensionality reduction preserves sufficient information for meaningful clustering of users and videos.
    Invoked when performing clustering to identify stable groups.
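
Since K is the one free parameter, a common way to constrain it is the mean silhouette score, computed for the labeling obtained at each candidate K. The following is a small stdlib-only sketch under assumptions: the toy points and both labelings are hypothetical, and whether the authors used this diagnostic is not stated in the abstract.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points; expects at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two tight, well-separated pairs of points (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups the nearby points
bad = silhouette_score(points, [0, 1, 0, 1])   # splits each nearby pair
```

A labeling that respects the geometry scores close to 1, while one that cuts across it scores below 0, so comparing scores across candidate values of K gives a rough, data-driven basis for the choice.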


