Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning
Pith reviewed 2026-05-25 01:28 UTC · model grok-4.3
The pith
A convolutional neural network predicts Twitter user gender at 82 percent accuracy and finds men and women differ in HPV vaccine perceptions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose a convolutional neural network model for gender prediction using English Twitter text as input. An ensemble of the proposed model achieved an accuracy of 0.8237 on gender prediction, which compared favorably with the state-of-the-art performance in a recent author profiling task. They then used the trained models to predict gender labels for an HPV vaccine related corpus and identified gender differences in public perceptions of the HPV vaccine. The findings are largely consistent with previous survey-based studies.
What carries the argument
Ensemble of convolutional neural networks that classify gender from tweet text and then label an HPV vaccine tweet corpus for perception comparison.
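To make the ensemble step concrete, here is a minimal sketch of majority voting over per-model gender predictions. It assumes the ensemble combines hard labels by simple majority; the page only reports a "Voting" result, so the exact scheme, the number of models, and the names `majority_vote` and `ensemble_predict` are illustrative assumptions, and the labels are made up.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among an odd number of model predictions."""
    if len(predictions) % 2 == 0:
        # With two classes, an odd number of voters rules out ties.
        raise ValueError("use an odd number of models to avoid ties")
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(per_model_labels):
    """Combine predictions; per_model_labels holds one label list per model."""
    # zip(*...) regroups the labels so each tuple holds all votes for one user.
    return [majority_vote(votes) for votes in zip(*per_model_labels)]


# Hypothetical outputs of three trained models for four users.
model_outputs = [
    ["female", "male", "male", "female"],
    ["female", "female", "male", "male"],
    ["male", "male", "male", "female"],
]
ensemble_labels = ensemble_predict(model_outputs)
print(ensemble_labels)
```

Hard voting is only one option; averaging predicted probabilities before thresholding would be an equally plausible reading of "ensemble".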
If this is right
- Gender differences in HPV vaccine perceptions can be measured at large scale from social media without new surveys.
- The approach provides performance comparable to existing methods for predicting author attributes from text.
- Twitter-based analysis can serve as a complement to traditional surveys for monitoring public health attitudes.
- Targeted health communication can draw on real-time gender-specific sentiment patterns extracted from posts.
Where Pith is reading between the lines
- The same pipeline could be tested on other vaccines or health topics to check whether gender perception gaps appear consistently.
- If the model generalizes across topics, public health agencies could build ongoing dashboards of demographic opinion shifts.
- Extensions might test whether adding user metadata such as location improves the reliability of the gender labels.
Load-bearing premise
A gender classifier trained on general Twitter data transfers to the HPV vaccine tweet corpus without substantial loss of accuracy from topic shift or label errors.
What would settle it
Manual annotation of gender for a held-out sample of HPV vaccine tweets, followed by measurement of the model's accuracy on that sample; accuracy well below 0.82 would falsify reliable transfer.
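The check described above can be sketched as follows, assuming a manually annotated sample of HPV vaccine tweet authors were available. The function names, the sample of 200 users, and the use of a Wilson score interval are assumptions made for illustration; only the reference accuracy of 0.8237 comes from the paper.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def transfer_check(gold, predicted, reference_accuracy=0.8237):
    """Compare accuracy on annotated target-domain data with the reported accuracy."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    low, high = wilson_interval(correct, len(gold))
    return {
        "accuracy": correct / len(gold),
        "ci": (low, high),
        # True only when the whole interval lies below the reference value.
        "below_reference": high < reference_accuracy,
    }


# Hypothetical annotated sample: 200 users, 140 predicted correctly.
gold = ["female"] * 100 + ["male"] * 100
predicted = (["female"] * 70 + ["male"] * 30) + (["male"] * 70 + ["female"] * 30)
result = transfer_check(gold, predicted)
print(result)
```

Reporting an interval rather than a point estimate matters here because a small annotated sample could fall below 0.82 by chance alone.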
Original abstract
In this study, we proposed a convolutional neural network model for gender prediction using English Twitter text as input. Ensemble of proposed model achieved an accuracy at 0.8237 on gender prediction and compared favorably with the state-of-the-art performance in a recent author profiling task. We further leveraged the trained models to predict the gender labels from an HPV vaccine related corpus and identified gender difference in public perceptions regarding HPV vaccine. The findings are largely consistent with previous survey-based studies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a convolutional neural network model for gender prediction from English Twitter text. An ensemble of the model achieves 0.8237 accuracy on this task and compares favorably to state-of-the-art results in a recent author profiling shared task. The trained models are applied to label gender in an HPV vaccine-related tweet corpus, after which gender differences in public perceptions are identified; these differences are reported to be largely consistent with prior survey-based studies.
Significance. If the gender labels transfer reliably to the HPV corpus, the work illustrates a scalable deep-learning approach for demographic stratification of social-media perceptions on public-health topics, providing a potential complement to traditional surveys. The reported consistency with survey literature is a strength, but the absence of target-domain validation substantially weakens the evidential basis for the perception-difference claims.
major comments (2)
- [Abstract] The reported accuracy of 0.8237 applies only to the general Twitter gender-prediction task. No accuracy, confusion matrix, calibration check, or manual validation is supplied for the gender labels assigned to the HPV-vaccine tweets, leaving the downstream perception analysis vulnerable to domain shift.
- [Abstract] No dataset sizes, cross-validation details, error analysis, or statistical test of the reported perception differences are provided, so the robustness and significance of the gender-stratified findings cannot be assessed from the given information.
minor comments (1)
- [Abstract] The abstract would be clearer if it stated the size of the HPV corpus and the number of tweets to which gender labels were assigned.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting important aspects of validation and reporting. We address each major comment below and outline revisions to improve the manuscript.
- Referee: [Abstract] The reported accuracy of 0.8237 applies only to the general Twitter gender-prediction task. No accuracy, confusion matrix, calibration check, or manual validation is supplied for the gender labels assigned to the HPV-vaccine tweets, leaving the downstream perception analysis vulnerable to domain shift.
  Authors: We agree that no target-domain validation (accuracy, confusion matrix, or manual checks) is provided for gender labels on the HPV-vaccine tweets, as ground-truth annotations are unavailable for this corpus. This leaves open the possibility of domain shift. The model was trained and evaluated on a large general Twitter dataset from the PAN 2018 task, and HPV tweets originate from the same platform. We will add a limitations subsection explicitly discussing domain shift risks and their implications for the perception findings. We will also include the confusion matrix from the gender prediction task. However, we cannot supply accuracy metrics on the HPV tweets without new labeled data.
  Revision: partial
- Referee: [Abstract] No dataset sizes, cross-validation details, error analysis, or statistical test of the reported perception differences are provided, so the robustness and significance of the gender-stratified findings cannot be assessed from the given information.
  Authors: Dataset sizes for the gender prediction training set and the HPV corpus are reported in the methods section, along with cross-validation details and error analysis for the CNN ensemble. We acknowledge that statistical tests for the gender differences in perceptions (e.g., topic distributions) are not included. We will add appropriate statistical tests to the results section and revise the abstract to include key dataset sizes, cross-validation information, and a note on the statistical analysis of differences.
  Revision: yes
- Absence of target-domain validation metrics for gender labels on the HPV-vaccine tweets, as no ground-truth gender data exists for this corpus without additional annotation.
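As an illustration of the kind of statistical test discussed in the second response, here is a minimal sketch of a Pearson chi-square test of independence on a 2x2 table of predicted gender against one perception category. The counts are invented, the function name `chi_square_2x2` is hypothetical, and no continuity correction is applied; this does not claim to reproduce the authors' analysis.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (1 degree of freedom).

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are predicted female / predicted male,
# columns are tweets that do / do not mention a given perception category.
counts = [[120, 380], [90, 410]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi2 = {statistic:.3f}, p = {p_value:.4f}")
```

Because the gender labels are predictions rather than ground truth, a p-value from such a test inherits any misclassification in those labels.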
Circularity Check
No circularity: empirical pipeline with external consistency check
The paper trains a CNN gender classifier on general Twitter data, reports held-out accuracy (0.8237), then applies the fixed model to a separate HPV-vaccine tweet corpus and compares the resulting gender-stratified perception patterns against independent survey literature. No equations, fitted parameters, or self-citations are used to derive the central claims; the gender labels on the target corpus are produced by a model whose performance was measured on a disjoint distribution, and the perception differences are validated externally rather than by construction. This is a standard supervised transfer application whose validity rests on untested domain-shift assumptions, not on any self-referential reduction.
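To show why the domain-shift assumption matters numerically, the sketch below inverts a simple misclassification model. It assumes a symmetric error rate, so that both genders are labeled correctly with the same probability a, and that a equals the reported 0.8237 on the HPV corpus; neither assumption is established by the paper, and the function name `corrected_share` is hypothetical. Under that model an observed share q of one predicted gender relates to the true share p by q = a*p + (1 - a)*(1 - p).

```python
def corrected_share(observed_share, accuracy=0.8237):
    """Estimate the true share p from the observed share q of one predicted gender.

    Inverts q = a*p + (1 - a)*(1 - p), assuming the same accuracy a for both genders.
    """
    if not 0.5 < accuracy <= 1.0:
        raise ValueError("accuracy must be above chance (0.5) and at most 1.0")
    if not 0.0 <= observed_share <= 1.0:
        raise ValueError("observed_share must lie in [0, 1]")
    p = (observed_share + accuracy - 1) / (2 * accuracy - 1)
    # Sampling noise can push the estimate slightly outside [0, 1].
    return min(1.0, max(0.0, p))


# Hypothetical example: 60% of tweets in some category carry a predicted "female" label.
estimate = corrected_share(0.60)
print(round(estimate, 4))
```

The correction pushes shares away from 0.5, which means that label noise of this kind would tend to understate, not create, a gender gap; differential errors across topics could behave differently.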
Reference graph
Works this paper leans on
- [1] and studied how vaccine tweet counts vary across genders. However, Demographer and other name-based inference tools [30,31] suffer limitations when the name information of Twitter users is not available or accurate. 2019 KDD DSHealth 2019, Anchorage, AK Du et al. Previous efforts framed gender prediction as binary classification tasks and proposed ma...
- [2] to evaluate a convolutional neural network-based deep learning model for English Twitter gender prediction and evaluate the model on the most recent open challenge task: 6th Author Profiling Task at PAN 2018 [20]
- [3] to leverage the trained model on the gender prediction of Twitter users who have discussed HPV vaccine related topics and investigate the gender difference on the public perceptions regarding HPV vaccines. 2.1 Datasets Author profiling tasks at PAN is a series of international challenges which aim to classify the texts into classes based on the stylistic ...
- [4] and one Theory of Planned Behavior (TPB) construct (i.e. attitude) [10]. The gold standard data was used to train and evaluate the attentive recurrent neural network described in [9]. We repeated random sampling of the tweets and the training 10 times for each construct and the final prediction of all the unlabeled tweets was based on the community en...
- [5] for pre-processing (e.g. user name normalization, URL normalization, lowercase), then leveraged NLTK 3.4.1 TweetTokenizer for tokenization and Taggers for Part of Speech (POS) Tagging. For each user, we combined all the 100 tweets into a single document, which served as the input for the machine learning and deep ...
- [6] The ensemble model improved the accuracy for all the algorithms
  Table 2. Comparison of algorithms on Twitter gender prediction
            SVM     RNN     CNN     CNN_char  CNN_char_pos
    Mean    0.7902  0.7874  0.8019  0.8127    0.8128
    SD      0.0035  0.0106  0.0066  0.0018    0.0060
    Voting  0.7968  0.8047  0.8153  0.8189    0.8237
  3.2 Gender difference in public perceptions The trained ensemble of CNN_cha...
- [7] found men scored higher on the perceived barriers to HPV vaccine, while lower on perceived severity and perceived benefit than did women in a population of African-American college students. One limitation of this study is that we treated predicted gender as true gender in the following Chi-square test, which could lead to information bias due to the misclassifi...
- [8] Knowledge, beliefs, and behaviors: Examining human papillomavirus-related gender differences among African American College Students. J. Am. Coll. Heal. (2011). DOI:https://doi.org/10.1080/07448481.2010.503725
- [9] Achievements in public health, 1990-1999: impact of vaccines universally recommended for children. United States 1998, (1990), 243–248
- [10] Inflation of type I error rates due to differential misclassification in EHR-derived outcomes: Empirical illustration using breast cancer recurrence. Pharmacoepidemiol. Drug Saf. (2019). DOI:https://doi.org/10.1002/pds.4680
- [11] Understanding Vaccine Refusal: Why We Need Social Media Now. Am. J. Prev. Med. (2016). DOI:https://doi.org/10.1016/j.amepre.2015.10.002
- [12]
- [13] Optimization on machine learning based approaches for sentiment analysis on HPV vaccines related tweets. J. Biomed. Semantics 8, 1 (2017),
- [14] DOI:https://doi.org/10.1186/s13326-017-0120-6
- [15] An Empirical Study for Impacts of Measurement Errors on EHR based Association Studies. AMIA ... Annu. Symp. proceedings. AMIA Symp. (2016)
- [16] Examining Patterns of Influenza Vaccination in Social Media. Work. Thirty-First AAAI Conf. Artif. Intell. (2017), 1–5. Retrieved from http://www.cs.jhu.edu/~mdredze/publications/2017_w3phi_vaccines.pdf
- [17] Gender differences in knowledge and health beliefs related to behavioral intentions to prevent human papillomavirus infection. Asia-Pacific J. Public Heal. (2013). DOI:https://doi.org/10.1177/1010539512444307
- [18] Demographer: Extremely simple name demographics. NLP+ CSS 2016 (2016),
- [19] Measuring vaccine confidence: Analysis of data obtained by a media surveillance system used to analyse public concerns about vaccines. Lancet Infect. Dis. 13, 7 (2013), 606–613. DOI:https://doi.org/10.1016/S1473-3099(13)70108-7
- [20] Reframing medicine’s publics: The local as a public of vaccine refusal. J. Med. Humanit. 35, 2 (2014), 111–129
- [21] Geographic clusters in underimmunization and vaccine refusal. Pediatrics 135, 2 (2015), 280–289
- [22] Overview of the 6th author profiling task at PAN 2018: multimodal gender identification in Twitter. Work. Notes Pap. CLEF (2018)
- [23] Overview of the 5th author profiling task at PAN 2017: Gender and language variety identification in Twitter. In CEUR Workshop Proceedings
- [24] Developing age and gender predictive lexica over social media. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1146–1151. DOI:https://doi.org/10.3115/v1/d14-1121
- [25] US assessment of HPV types in cancers: implications for current and 9-valent HPV vaccines. JNCI J. Natl. Cancer Inst. 107, 6 (2015)
- [26] Recent trends in deep learning based natural language processing. IEEE Comput. Intell. Mag. 13, 3 (2018), 55–75. DOI:https://doi.org/10.1109/MCI.2018.2840738
- [27] A survey of location prediction on Twitter. IEEE Trans. Knowl. Data Eng. (2018)
- [28] Measles outbreak—California, December 2014-February
- [29] MMWR Morb Mortal Wkly Rep 64, 6 (2015), 153–154
- [30] Genderize.io | Determine the gender of a first name. Retrieved April 26, 2019 from https://genderize.io/