Danish Stance Classification and Rumour Resolution
Pith reviewed 2026-05-25 11:17 UTC · model grok-4.3
The pith
A linear SVM classifies Danish stances at 76 percent accuracy and feeds an HMM that predicts rumour veracity at 83 percent accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper generates a stance-annotated Danish Reddit dataset and shows that a Linear Support Vector Machine achieves the best stance classification results with an accuracy of 0.76 and macro F1 score of 0.42. It further shows that stance labels fed into a Hidden Markov Model can predict the veracity of rumours, reaching an accuracy of 0.83 and F1 of 0.68 when trained and tested on the Danish dataset alone. The HMM approach also transfers across languages and platforms, where it reaches an accuracy of 0.82 and F1 of 0.67, and replacing gold stance labels with automatic ones causes only a small drop in performance.
What carries the argument
The linear support vector machine for stance classification combined with a hidden Markov model that treats sequences of stance labels as observations to infer rumour veracity.
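The HMM half of this pipeline can be illustrated with a minimal Python sketch. It assumes one common way of using HMMs as sequence classifiers: keep one HMM per veracity class and label a rumour by whichever model assigns its stance sequence the higher likelihood. This page does not spell out the paper's exact HMM configuration, so the two-state models, the four-label stance alphabet, and every probability below are invented for illustration and are not taken from the paper.

```python
import math

# Hypothetical stance alphabet; the label set is an assumption, not quoted from the paper.
STANCES = ["support", "deny", "query", "comment"]

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | HMM) for discrete observations."""
    if not seq:
        raise ValueError("stance sequence must not be empty")
    n_states = len(start)
    # alpha[s] is proportional to P(observations so far, current state = s)
    alpha = [start[s] * emit[s][seq[0]] for s in range(n_states)]
    log_p = 0.0
    for obs in seq[1:]:
        # Rescale at every step so long sequences do not underflow.
        total = sum(alpha)
        log_p += math.log(total)
        alpha = [a / total for a in alpha]
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return log_p + math.log(sum(alpha))

# Illustrative parameters only; in practice they would be estimated from
# stance sequences of rumours with known veracity.
TRUE_HMM = dict(
    start=[0.6, 0.4],
    trans=[[0.7, 0.3], [0.4, 0.6]],
    emit=[
        {"support": 0.5, "deny": 0.05, "query": 0.1, "comment": 0.35},
        {"support": 0.2, "deny": 0.1, "query": 0.2, "comment": 0.5},
    ],
)
FALSE_HMM = dict(
    start=[0.5, 0.5],
    trans=[[0.6, 0.4], [0.3, 0.7]],
    emit=[
        {"support": 0.1, "deny": 0.5, "query": 0.15, "comment": 0.25},
        {"support": 0.1, "deny": 0.2, "query": 0.3, "comment": 0.4},
    ],
)

def predict_veracity(seq):
    """Return "true" or "false" by comparing the two class-specific likelihoods."""
    ll_true = log_likelihood(seq, **TRUE_HMM)
    ll_false = log_likelihood(seq, **FALSE_HMM)
    return "true" if ll_true >= ll_false else "false"

print(predict_veracity(["support", "support", "comment"]))
print(predict_veracity(["deny", "deny", "query"]))
```

In a full system the stance sequence would come from the stance classifier (or from gold annotations) for the posts in a rumour's conversation, in posting order.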
If this is right
- Stance-based veracity prediction transfers reasonably well from English Twitter data to Danish Reddit posts.
- Rumour veracity can be estimated from stance labels alone using an HMM without needing other features.
- Automatic stance classification is close enough to manual labels to support practical veracity prediction systems.
- Performance improves when the HMM is trained on language-specific data rather than cross-lingual data.
Where Pith is reading between the lines
- Similar stance-to-veracity pipelines could be built for other languages by creating small annotated datasets and reusing the HMM structure.
- The fact that cross-platform transfer works suggests that stance patterns are somewhat universal across social media.
- Future work could test whether adding temporal features or user metadata to the HMM would raise the veracity scores further.
Load-bearing premise
The stance annotations collected for the new Danish Reddit dataset are sufficiently accurate and representative that downstream HMM veracity predictions reflect genuine signal rather than annotation artifacts or domain mismatch.
What would settle it
Manually re-annotating a sample of the Danish Reddit posts and finding that the original stance labels disagree with the new annotations at a high rate would indicate that the reported HMM veracity accuracies may be inflated by annotation errors.
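The re-annotation check described here is usually summarised with a chance-corrected agreement statistic. The sketch below computes Cohen's kappa between an original and a second annotation pass; the choice of kappa and the ten example labels are assumptions made for illustration, since the page reports no agreement figures for the dataset.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotation passes over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both passes give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both passes labelled independently
    # according to their own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both passes used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Invented labels for ten posts: original annotation vs. a re-annotation.
original = ["comment"] * 6 + ["support"] * 2 + ["deny"] * 2
reannotated = ["comment"] * 5 + ["support"] * 3 + ["deny", "comment"]

print(round(cohen_kappa(original, reannotated), 3))  # 0.643
```

Raw agreement here is 0.80, but kappa is lower because much of that agreement is expected by chance when one label dominates.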
Original abstract
The Internet is rife with flourishing rumours that spread through microblogs and social media. Recent work has shown that analysing the stance of the crowd towards a rumour is a good indicator for its veracity. One state-of-the-art system uses an LSTM neural network to automatically classify stance for posts on Twitter by considering the context of a whole branch, while another, more simple Decision Tree classifier, performs at least as well by performing careful feature engineering. One approach to predict the veracity of a rumour is to use stance as the only feature for a Hidden Markov Model (HMM). This thesis generates a stance-annotated Reddit dataset for the Danish language, and implements various models for stance classification. Out of these, a Linear Support Vector Machine provides the best results with an accuracy of 0.76 and macro F1 score of 0.42. Furthermore, experiments show that stance labels can be used across languages and platforms with a HMM to predict the veracity of rumours, achieving an accuracy of 0.82 and F1 score of 0.67. Even higher scores are achieved by relying only on the Danish dataset. In this case veracity prediction scores an accuracy of 0.83 and an F1 of 0.68. Finally, when using automatic stance labels for the HMM, only a small drop in performance is observed, showing that the implemented system can have practical applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a new stance-annotated Danish Reddit dataset for rumours, evaluates multiple stance classification models (with Linear SVM achieving best results of 0.76 accuracy and 0.42 macro F1), and shows that gold or automatic stance labels can be fed into an HMM to predict rumour veracity, reaching 0.83 accuracy and 0.68 F1 on the Danish data (with slightly lower cross-lingual/platform results of 0.82/0.67).
Significance. If the stance annotations prove reliable, the work would be significant for demonstrating that stance-based veracity prediction via HMM transfers across languages (English to Danish) and platforms (Twitter to Reddit), and that automatic stance labels incur only a small performance drop, supporting practical deployment. The concrete performance numbers and the use of a simple, interpretable HMM are strengths.
major comments (3)
- [Dataset creation section] No inter-annotator agreement, annotation guidelines, or label distribution statistics are reported for the new Danish Reddit stance dataset. This is load-bearing for the central claim because the HMM veracity accuracies (0.83/0.68) depend directly on the quality of these stance labels; the modest macro F1 of 0.42 on the SVM stance classifier is consistent with either severe imbalance or label noise that could make downstream results reflect annotation artifacts.
- [Experimental results section] No train/test split details, error bars, or statistical significance tests are provided for the reported accuracy and F1 scores on either stance classification or HMM veracity prediction. This undermines confidence in the headline numbers (0.76/0.42 for stance; 0.83/0.68 for veracity) and the claim that automatic stance yields only a small drop.
- [Stance classification experiments] Only internal model variants are compared; no external baselines from prior stance detection literature are included, making it impossible to assess whether the Linear SVM result represents a meaningful advance for Danish data.
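The gap between accuracy and macro F1 raised in the first major comment is easy to reproduce on skewed data. The following sketch uses invented label counts (not the paper's distribution, which is unreported) and a degenerate classifier that always predicts the majority class, to show how accuracy can stay high while macro F1 collapses.

```python
def accuracy(gold, pred):
    """Share of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced test set: 80% of posts are plain comments.
gold = ["comment"] * 80 + ["support"] * 10 + ["deny"] * 5 + ["query"] * 5
pred = ["comment"] * 100  # majority-class baseline

print(accuracy(gold, pred))            # 0.8
print(round(macro_f1(gold, pred), 3))  # 0.222
```

Because macro F1 weights every class equally, the three minority classes pull the average down regardless of how many posts they cover, which is why the class distribution is needed to interpret a score such as 0.42.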
minor comments (2)
- [Abstract and results] The abstract and results text would benefit from explicit statements of the number of rumours/posts in the Danish dataset and the class distribution to contextualize the macro F1 scores.
- [HMM veracity section] Notation for the HMM states and transition probabilities is introduced without a clear diagram or equation reference, making the veracity prediction pipeline harder to follow.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each of the major comments below and indicate the revisions we plan to make.
Point-by-point responses
- Referee: Dataset creation section: No inter-annotator agreement, annotation guidelines, or label distribution statistics are reported for the new Danish Reddit stance dataset. This is load-bearing for the central claim because the HMM veracity accuracies (0.83/0.68) depend directly on the quality of these stance labels; the modest macro F1 of 0.42 on the SVM stance classifier is consistent with either severe imbalance or label noise that could make downstream results reflect annotation artifacts.
  Authors: We agree that reporting inter-annotator agreement, annotation guidelines, and label distributions is essential. The dataset was annotated following adapted guidelines from English rumour stance datasets, but due to the scope of the thesis work, IAA was not calculated. We will add the label distribution and a summary of the guidelines to the dataset creation section in the revision. The high veracity prediction accuracy with gold labels supports that the annotations are of sufficient quality for this task, though we acknowledge the modest stance F1 may indicate class imbalance.
  Revision: partial
- Referee: Experimental results section: No train/test split details, error bars, or statistical significance tests are provided for the reported accuracy and F1 scores on either stance classification or HMM veracity prediction. This undermines confidence in the headline numbers (0.76/0.42 for stance; 0.83/0.68 for veracity) and the claim that automatic stance yields only a small drop.
  Authors: We will revise the experimental results section to include explicit details on the train/test splits used. For error bars and significance tests, we will perform additional experiments with multiple random seeds to provide these statistics in the revised manuscript.
  Revision: yes
- Referee: Stance classification experiments: Only internal model variants are compared; no external baselines from prior stance detection literature are included, making it impossible to assess whether the Linear SVM result represents a meaningful advance for Danish data.
  Authors: The primary goal was to establish baseline performance for the new Danish dataset by comparing several standard models internally. We will expand the related work section to reference prior stance detection approaches and discuss why direct comparison is challenging across languages, while noting that the SVM outperforms the other models tested on this data.
  Revision: partial
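The error bars requested in the second exchange could be produced in several ways. The authors mention repeated runs with different random seeds; the sketch below instead shows a percentile bootstrap over the test items, a different technique that needs no retraining. The function name, the resample count, and the example predictions are all assumptions made for illustration.

```python
import random

def bootstrap_accuracy_ci(gold, pred, n_resamples=1000, alpha=0.05, seed=0):
    """Point accuracy plus a percentile-bootstrap (1 - alpha) interval."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(gold)
    correct = [int(g == p) for g, p in zip(gold, pred)]
    stats = []
    for _ in range(n_resamples):
        # Resample test items with replacement and recompute accuracy.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    low = stats[int((alpha / 2) * n_resamples)]
    high = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(correct) / n, low, high

# Invented veracity predictions for 30 rumours, 25 of them correct.
gold = ["true"] * 15 + ["false"] * 15
pred = ["true"] * 13 + ["false"] * 2 + ["false"] * 12 + ["true"] * 3

point, low, high = bootstrap_accuracy_ci(gold, pred)
print(f"accuracy = {point:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With only a few dozen rumours in a test set the interval will typically be wide, which is the practical reason the referee asks for it.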
Circularity Check
No circularity; empirical results on held-out data
Full rationale
The paper reports standard machine-learning performance numbers (SVM stance classification accuracy 0.76 / macro F1 0.42; HMM veracity prediction accuracy 0.83 / F1 0.68) obtained by training on one portion of the newly collected Danish Reddit dataset and evaluating on held-out data. No equations, fitted parameters, or self-citations reduce these test-set metrics to quantities defined on the same test instances. The stance labels serve as input features for the HMM; the reported veracity scores are not forced by construction from the stance-classifier training objective. No self-definitional, fitted-input-called-prediction, or self-citation-load-bearing steps appear.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel. Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: Linear Support Vector Machine provides the best results with an accuracy of 0.76 and macro F1 score of 0.42 on Danish stance classification; stance labels fed to an HMM yield veracity prediction accuracy 0.83 and F1 0.68
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, theorem reality_from_one_distinction. Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: Hidden Markov Model (HMM) ... stance as the only feature
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.