Transfer Learning for Risk Classification of Social Media Posts: Model Evaluation Study

Derek Howard; Geoffrey Woollard; Jacob Ritchie; Justin Lee; Leon French; Marta Maslej

arxiv: 1907.02581 · v2 · pith:ONWJNFIEnew · submitted 2019-07-04 · 💻 cs.CL

Derek Howard, Marta Maslej, Justin Lee, Jacob Ritchie, Geoffrey Woollard, Leon French

Pith reviewed 2026-05-25 08:57 UTC · model grok-4.3

classification 💻 cs.CL

keywords: transfer learning, risk classification, social media, mental health, GPT-1, CLPsych 2017, AutoML, fine-tuning

The pith

Fine-tuning GPT-1 on over 150,000 unlabeled forum posts yields a new state-of-the-art result on the CLPsych 2017 risk classification task for mental health forum posts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that transfer learning via fine-tuning a pre-trained language model on large amounts of unlabeled text from the same domain improves classification of risk levels in mental health forum posts. Using only 1588 labeled examples from the CLPsych 2017 task, the approach outperforms lexicon-based features and other embeddings when combined with automated machine learning tools. A reader would care because it demonstrates a practical route to building triage systems for online support communities when labeled data is scarce, without relying on user metadata or prior posts.

Core claim

The top-performing system used features derived from the GPT-1 model, which was finetuned on over 150,000 unlabeled posts from Reachout.com. Our top system had a macro averaged F1 score of 0.572, providing a new state-of-the-art result on the CLPsych 2017 task. This was achieved without additional information from meta-data or preceding posts. We show that transfer learning is an effective strategy for predicting risk with relatively little labeled data and that finetuning of pretrained language models provides further gains when large amounts of unlabeled text is available.
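For readers unfamiliar with the metric, a macro-averaged F1 score is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The sketch below is a minimal pure-Python illustration, not the shared task's official scorer. The label names and the example predictions are placeholders invented for illustration, and the optional `labels` argument is an assumption that lets a caller restrict which classes enter the average, since this page does not state which classes the reported 0.572 is averaged over.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels=None):
    """Return the unweighted mean of per-class F1 scores.

    labels: classes to average over; defaults to every class seen in
    y_true or y_pred. A class with no true or predicted instances
    contributes an F1 of 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder labels and predictions, for illustration only.
    truth = ["green", "green", "amber", "red", "crisis", "amber"]
    preds = ["green", "amber", "amber", "red", "red", "amber"]
    print(round(macro_f1(truth, preds), 4))
    print(round(macro_f1(truth, preds, labels=["amber", "red", "crisis"]), 4))
```

Because every class carries equal weight, a system that never predicts the rarest class correctly loses a full share of the average, which is why the choice of classes passed as `labels` matters when comparing against prior systems.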

What carries the argument

Fine-tuned GPT-1 embeddings used as input features for AutoML classifiers that assign one of four risk categories to each post.

If this is right

Risk classifiers can be built without access to user metadata or conversation history.
Domain-specific fine-tuning on unlabeled text yields measurable gains over off-the-shelf pre-trained models.
The resulting models still miss many expressions of hopelessness, indicating a remaining error pattern.
Visualizations of the learned decision boundaries can be produced to inspect what patterns the classifiers capture.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same fine-tuning recipe could be tested on risk classification tasks outside mental health, such as detecting other forms of distress in online text.
Newer language models could be substituted for GPT-1 to check whether further gains are available with the same unlabeled corpus.
The approach suggests a template for other low-labeled-data text triage problems where domain-specific unlabeled text is abundant.

Load-bearing premise

The 1588 labeled posts are representative of the risk classification task in general and fine-tuning on the unlabeled Reachout posts improves generalization rather than causing domain-specific overfitting.

What would settle it

Evaluation on a fresh set of labeled posts drawn from a different mental health forum where the fine-tuned GPT-1 system fails to exceed the performance of untuned embeddings or lexicon baselines.

Original abstract

Mental illness affects a significant portion of the worldwide population. Online mental health forums can provide a supportive environment for those afflicted and also generate a large amount of data which can be mined to predict mental health states using machine learning methods. We benchmark multiple methods of text feature representation for social media posts and compare their downstream use with automated machine learning (AutoML) tools to triage content for moderator attention. We used 1588 labeled posts from the CLPsych 2017 shared task collected from the Reachout.com forum (Milne et al., 2019). Posts were represented using lexicon based tools including VADER, Empath, LIWC and also used pre-trained artificial neural network models including DeepMoji, Universal Sentence Encoder, and GPT-1. We used TPOT and auto-sklearn as AutoML tools to generate classifiers to triage the posts. The top-performing system used features derived from the GPT-1 model, which was finetuned on over 150,000 unlabeled posts from Reachout.com. Our top system had a macro averaged F1 score of 0.572, providing a new state-of-the-art result on the CLPsych 2017 task. This was achieved without additional information from meta-data or preceding posts. Error analyses revealed that this top system often misses expressions of hopelessness. We additionally present visualizations that aid understanding of the learned classifiers. We show that transfer learning is an effective strategy for predicting risk with relatively little labeled data. We note that finetuning of pretrained language models provides further gains when large amounts of unlabeled text is available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

They report a new high score on the old CLPsych 2017 benchmark by fine-tuning GPT-1 on 150k same-domain unlabeled posts, but the result rests on an untested transfer step.

The paper's central result is a macro F1 of 0.572 on the CLPsych 2017 shared task, obtained by feeding GPT-1 features (after fine-tuning on over 150k unlabeled Reachout.com posts) into AutoML pipelines. This beats the numbers cited from prior work on the same 1588 labeled posts. They also run a range of other representations (VADER, LIWC, DeepMoji, Universal Sentence Encoder) through TPOT and auto-sklearn, and they include error analysis plus some classifier visualizations. That package is the useful part: a straightforward comparison of off-the-shelf text features on a real triage task with very little labeled data. The error analysis at least flags that the best system still misses hopelessness language, which is concrete and usable. The visualizations are a minor plus for interpretability.

The soft spot is exactly the one the stress-test flags. There is no ablation that holds the AutoML pipeline fixed and swaps only the GPT-1 features (pre-trained vs. fine-tuned on the 150k posts). Without that, you cannot tell whether the fine-tuning step improves risk classification or simply encodes forum-specific patterns. The paper also needs to state explicitly that the unlabeled corpus excludes the official test partition; the shared source makes leakage plausible. The abstract gives no train-test split details, no cross-validation procedure, and no significance test, so the SOTA claim is only weakly supported until the methods section is checked.

This work is for researchers who build or evaluate risk classifiers on mental-health forums and want a current baseline with transfer learning. It is not introducing new methods, so it will mainly be cited for the benchmark number if the fine-tuning benefit holds up. The experimental setup is honest enough and the task is important enough that a serious editor should send it to referees rather than desk-reject; the reviewers can ask for the missing ablation and leakage check.

I would bring it to a reading group only if the group is focused on applied mental-health NLP.
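The ablation the letter asks for can be pictured as a small harness that fixes the classifier and the cross-validation folds and varies only the feature matrix. The sketch below is an illustration under loud assumptions: a nearest-centroid classifier stands in for the TPOT or auto-sklearn pipeline, plain accuracy stands in for macro F1, and the feature matrices are synthetic. All function names (`run_ablation`, `cv_accuracy`, and so on) are hypothetical and do not come from the paper.

```python
import random


def fit_centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, X):
    """Assign each row to the class with the closest centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(centroids[lab], row)) for row in X]


def kfold_indices(n, k, seed=0):
    """Shuffle indices once and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(X, y, folds):
    """Cross-validated accuracy using the supplied, fixed folds."""
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in held_out])
        correct += sum(p == y[i] for p, i in zip(preds, held_out))
    return correct / len(y)


def run_ablation(features_by_name, y, k=5, seed=0):
    """Score each feature set with the same classifier and the same folds."""
    folds = kfold_indices(len(y), k, seed)
    return {name: cv_accuracy(X, y, folds) for name, X in features_by_name.items()}


if __name__ == "__main__":
    # Synthetic stand-ins for "pre-trained" and "fine-tuned" feature matrices.
    labels = ["low", "high"] * 10
    pretrained = [[0.0] for _ in labels]
    finetuned = [[0.0] if lab == "low" else [1.0] for lab in labels]
    print(run_ablation({"pretrained": pretrained, "finetuned": finetuned}, labels))
```

The design point is that the folds are built once and shared, so any difference between the two scores can only come from the features, not from a different split or a different model search.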

Referee Report

3 major / 2 minor

Summary. The paper benchmarks lexicon-based (VADER, Empath, LIWC) and neural (DeepMoji, Universal Sentence Encoder, GPT-1) feature representations for risk classification of 1588 labeled posts from the CLPsych 2017 shared task on Reachout.com. Classifiers are generated via TPOT and auto-sklearn AutoML; the top system uses GPT-1 features after fine-tuning on >150k unlabeled posts from the same forum and reports a new state-of-the-art macro F1 of 0.572 without metadata or thread context. Error analysis and visualizations are also presented, with the conclusion that transfer learning plus domain-specific fine-tuning is effective for low-resource risk triage.

Significance. If the empirical claims hold after proper validation, the work provides concrete evidence that fine-tuning a pre-trained language model on large amounts of in-domain unlabeled text can improve downstream risk classification when labeled data are scarce (here only 1588 examples). This has direct applicability to online mental-health forum moderation and contributes a reproducible benchmark on a public shared-task dataset.

major comments (3)

[Abstract / Results] Abstract and experimental description: the headline claim of a new SOTA macro F1 of 0.572 is presented without any statement of the official CLPsych train/test split, cross-validation folds, number of AutoML runs, or statistical significance testing against prior systems; these details are load-bearing for the SOTA attribution.
[Methods / Results] Fine-tuning procedure (abstract and methods): no ablation is reported that isolates the effect of fine-tuning GPT-1 on the 150k unlabeled Reachout posts versus using the original pre-trained GPT-1 under identical AutoML pipelines and splits; without this comparison the transfer-learning benefit cannot be distinguished from domain adaptation or overfitting.
[Data / Methods] Data leakage risk (abstract): the 150k unlabeled posts are drawn from the identical Reachout.com source as the 1588 labeled CLPsych posts; the manuscript supplies no verification that the unlabeled corpus excludes the official test partition, which would invalidate the reported generalization.
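The leakage concern in the third major comment can be checked mechanically. The following is a minimal sketch, assuming both corpora are available as lists of post strings; it fingerprints each post after whitespace and case normalization and reports exact overlaps. The function names are hypothetical, and the check only catches exact matches after normalization: quoted replies or lightly edited reposts would need fuzzy matching, and matching on post identifiers would be more reliable if the data provides them.

```python
import hashlib
import re


def fingerprint(text):
    """Hash a post after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_overlap(unlabeled_posts, test_posts):
    """Return (unlabeled_index, test_index) pairs whose text matches."""
    test_index = {}
    for i, post in enumerate(test_posts):
        test_index.setdefault(fingerprint(post), []).append(i)
    overlaps = []
    for j, post in enumerate(unlabeled_posts):
        for i in test_index.get(fingerprint(post), []):
            overlaps.append((j, i))
    return overlaps


def drop_overlap(unlabeled_posts, test_posts):
    """Return the unlabeled posts that do not appear in the test set."""
    test_fingerprints = {fingerprint(p) for p in test_posts}
    return [p for p in unlabeled_posts if fingerprint(p) not in test_fingerprints]


if __name__ == "__main__":
    # Made-up posts, for illustration only.
    unlabeled = ["Thanks  for the support", "Exams are next week", "Hello all"]
    test = ["exams are next week ", "A brand new post"]
    print(find_overlap(unlabeled, test))
    print(drop_overlap(unlabeled, test))
```

An empty result from `find_overlap` would support the claim that the fine-tuning corpus excludes the test partition at the exact-text level; a non-empty result would call for re-running the fine-tuning on the output of `drop_overlap`.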

minor comments (2)

[Methods] The description of the AutoML search spaces and hyper-parameter ranges for TPOT and auto-sklearn is not provided; adding these would improve reproducibility.
[Error Analysis] Error analysis notes that the top system misses expressions of hopelessness, but no quantitative breakdown (e.g., per-class confusion or example posts) is given to support this observation.
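The quantitative breakdown requested in the second minor comment could take the form of a confusion table with per-class recall. The sketch below is a pure-Python illustration with invented labels and predictions; it is not the authors' error analysis, and the helper names are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true_label, predicted_label) pairs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(y_true, y_pred, labels):
    """Fraction of each true class that was predicted correctly.

    Returns None for a class with no true instances.
    """
    counts = confusion_counts(y_true, y_pred)
    recall = {}
    for label in labels:
        support = sum(n for (truth, _), n in counts.items() if truth == label)
        recall[label] = counts[(label, label)] / support if support else None
    return recall


def format_confusion(y_true, y_pred, labels):
    """Render the confusion counts as a plain-text table (rows are true labels)."""
    counts = confusion_counts(y_true, y_pred)
    width = max(len(label) for label in labels) + 2
    lines = ["true\\pred".ljust(width + 2) + "".join(l.rjust(width) for l in labels)]
    for truth in labels:
        cells = "".join(str(counts[(truth, pred)]).rjust(width) for pred in labels)
        lines.append(truth.ljust(width + 2) + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder labels and predictions, for illustration only.
    labels = ["green", "amber", "red", "crisis"]
    truth = ["green", "green", "amber", "red", "crisis", "amber"]
    preds = ["green", "amber", "amber", "red", "red", "amber"]
    print(format_confusion(truth, preds, labels))
    print(per_class_recall(truth, preds, labels))
```

A low recall in the highest-risk row, together with the column those posts were pushed into, would make the "misses hopelessness" observation checkable rather than anecdotal.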

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the thorough review and valuable feedback on our manuscript. We appreciate the opportunity to clarify and strengthen our work. Below we provide point-by-point responses to the major comments, indicating where revisions will be made to the manuscript.

Referee: [Abstract / Results] Abstract and experimental description: the headline claim of a new SOTA macro F1 of 0.572 is presented without any statement of the official CLPsych train/test split, cross-validation folds, number of AutoML runs, or statistical significance testing against prior systems; these details are load-bearing for the SOTA attribution.

Authors: We agree that these details are essential for supporting the SOTA claim. In the revised manuscript, we will update the abstract and results section to specify the official CLPsych 2017 train/test split used, the cross-validation folds in the AutoML process, the number of AutoML runs conducted, and include statistical significance testing against previous systems. revision: yes

Referee: [Methods / Results] Fine-tuning procedure (abstract and methods): no ablation is reported that isolates the effect of fine-tuning GPT-1 on the 150k unlabeled Reachout posts versus using the original pre-trained GPT-1 under identical AutoML pipelines and splits; without this comparison the transfer-learning benefit cannot be distinguished from domain adaptation or overfitting.

Authors: We acknowledge the importance of this ablation. The revised manuscript will include an ablation study comparing the fine-tuned GPT-1 features against the original pre-trained GPT-1 features using the same AutoML pipelines and data splits to clearly demonstrate the benefit of domain-specific fine-tuning. revision: yes

Referee: [Data / Methods] Data leakage risk (abstract): the 150k unlabeled posts are drawn from the identical Reachout.com source as the 1588 labeled CLPsych posts; the manuscript supplies no verification that the unlabeled corpus excludes the official test partition, which would invalidate the reported generalization.

Authors: This is a critical point. We will revise the methods and data description sections to provide explicit verification that the 150k unlabeled posts do not include any posts from the official CLPsych 2017 test partition, thereby confirming no data leakage occurred. revision: yes
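The significance testing promised in the first response could be done in several ways; one common option when two systems are scored on the same test posts is a paired bootstrap. The sketch below is an illustration of that general technique, not the authors' procedure: it resamples test items with replacement and reports how often system A beats system B on macro F1. The label names, predictions, resample count, and function names are all assumptions made for the example.

```python
import random
from functools import partial


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over a fixed label set."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def paired_bootstrap(y_true, pred_a, pred_b, metric, n_resamples=1000, seed=0):
    """Return (observed difference, fraction of resamples where A beats B).

    Each resample draws test items with replacement and scores both
    systems on the same drawn items, which keeps the comparison paired.
    """
    rng = random.Random(seed)
    n = len(y_true)
    observed = metric(y_true, pred_a) - metric(y_true, pred_b)
    wins = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        truth = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        if metric(truth, a) > metric(truth, b):
            wins += 1
    return observed, wins / n_resamples


if __name__ == "__main__":
    # Placeholder labels and predictions, for illustration only.
    labels = ["green", "amber", "red", "crisis"]
    truth = labels * 5
    system_a = list(truth)
    system_b = ["amber", "red", "crisis", "green"] * 5
    metric = partial(macro_f1, labels=labels)
    print(paired_bootstrap(truth, system_a, system_b, metric, n_resamples=200))
```

One minus the returned win fraction is often read as an approximate one-sided p-value for "A is better than B". With a test set of a few hundred posts and rare high-risk classes, the spread across resamples is usually wide, which is the reason a single headline score is hard to interpret on its own.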

Circularity Check

0 steps flagged

No circularity: empirical benchmark on public shared-task data

The paper reports a macro F1 of 0.572 obtained by extracting features from a GPT-1 model fine-tuned on 150k unlabeled Reachout.com posts, then training AutoML classifiers on the 1588 labeled CLPsych 2017 posts and evaluating on the official held-out test partition. No equation, definition, or self-citation reduces the reported F1 to a fitted parameter or to the input data by construction; the performance number is produced by standard supervised evaluation on externally supplied splits. The method description contains no self-definitional loop, no renaming of a known result as a derivation, and no load-bearing uniqueness theorem imported from the authors' prior work. The result is therefore self-contained against the public benchmark.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the representativeness of the CLPsych 2017 labeled posts and the assumption that fine-tuning on unlabeled forum text yields genuine generalization gains rather than overfitting.

free parameters (1)

AutoML search parameters
TPOT and auto-sklearn optimize over classifier choices and hyperparameters on the given features.

axioms (1)

domain assumption: The 1588 labeled posts are a sufficient and unbiased sample for claiming state-of-the-art performance on the risk triage task.
Invoked when stating the new SOTA result.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 10 internal anchors

[1] Available from: https://apps.who.int/iris/bitstream/handle/10665/254610/WHO-MSD-MER-2017.2-eng.pdf

[2] Franklin JC, Ribeiro JD, Fox KR, Bentley KH, Kleiman EM, Huang X, Musacchio KM, Jaroszewski AC, Chang BP, Nock MK. Risk factors for suicidal thoughts and behaviors: A meta-analysis of 50 years of research. Psychol Bull [Internet] 2017 Feb;143(2):187–232. PMID:27841450

[3] Davidson L, Chinman M, Sells D, Rowe M. Peer support among adults with serious mental illness: a report from the field. Schizophr Bull [Internet] 2006 Jul;32(3):443–450. PMID:16461576

[4] Kaplan K, Salzer MS, Solomon P, Brusilovskiy E, Cousounis P. Internet peer support for individuals with psychiatric disabilities: A randomized controlled trial. Soc Sci Med [Internet] 2011 Jan;72(1):54–62. PMID:21112682

[5] Ridout B, Campbell A. The Use of Social Networking Sites in Mental Health Interventions for Young People: Systematic Review. J Med Internet Res [Internet] 2018 Dec 18;20(12):e12244. PMID:30563811

[6] p. 3267–3276. [doi: 10.1145/2470654.2466447]

[7] Kornfield R, Sarma PK, Shah DV, McTavish F, Landucci G, Pe-Romashko K, Gustafson DH. Detecting Recovery Problems Just in Time: Application of Automated Linguistic Analysis and Supervised Machine Learning to an Online Substance Abuse Forum. J Med Internet Res [Internet] 2018 Jun 12;20(6):e10136. PMID:29895517

[8] Milne DN, McCabe KL, Calvo RA. Improving Moderator Responsiveness in Online Peer Support Through Automated Triage. J Med Internet Res [Internet] jmir.org; 2019 Apr 26;21(4):e11410. PMID:31025945

[9] Available from: http://arxiv.org/abs/1806.05258

[10] Available from: http://arxiv.org/abs/1709.01848

[11] Tausczik YR, Pennebaker JW. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. J Lang Soc Psychol [Internet] SAGE Publications Inc; 2010 Mar 1;29(1):24–54. [doi: 10.1177/0261927X09351676]

[12] Edwards T ’meisha, Holtzman NS. A meta-analysis of correlations between depression and first person singular pronoun use. J Res Pers [Internet] 2017 Jun 1;68:63–68. [doi: 10.1016/j.jrp.2017.02.005]

[13] Milne DN, Pink G, Hachey B, Calvo RA. Clpsych 2016 shared task: Triaging content in online peer-support forums. Proceedings of the Third Workshop on Computational Lingusitics and Clinical Psychology [Internet]

[14] Available from: http://arxiv.org/abs/1802.05365

[15] Available from: http://arxiv.org/abs/1801.06146

[16] Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. URL https://s3-us-west-2 amazonaws com/openai-assets/research-covers/languageunsupervised/language understanding paper pdf [Internet] 2018; Available from: https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf

[17] Zirikly A, Resnik P, Uzuner O, Hollingshead K. CLPsych 2019 shared task: Predicting the degree of suicide risk in Reddit posts. Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology

[18] p. 25–36. [doi: 10.18653/v1/W18-0603]

[19] p. 98–106. [doi: 10.1177/0706743718787795]

[20] Hutto CJ, Gilbert E. Vader: A parsimonious rule-based model for sentiment analysis of social media text. AAAI conference on weblogs and social media [Internet] aaai.org; 2014; Available from: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/viewPaper/8109

[21] Mahway: Lawrence Erlbaum Associates [Internet] 2001;71(2001):2001. Available from: http://www.depts.ttu.edu/psy/lusi/files/LIWCmanual.pdf

[22] Fast E, Chen B, Bernstein MS. Empath: Understanding Topic Signals in Large-Scale Text. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems [Internet] New York, NY, USA: ACM

[23] p. 4647–4657. [doi: 10.1145/2858036.2858535]

[24] Available from: http://arxiv.org/abs/1708.00524

[25] Available from: http://arxiv.org/abs/1803.11175

[26] Honnibal M, Montani I. spacy 2: Natural language understanding with bloom embeddings, convolutional neural networks and incremental parsing. To appear 2017

[27] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay É. Scikit-learn: Machine Learning in Python. J Mach Learn Res [Internet] 2011 [cited 2019 Jun 21];12(Oct):2825–2830. Available from: http://www.jmlr.org/papers/v12/pedregos...

[28] Olson RS, Urbanowicz RJ, Andrews PC, Lavender NA, Kidd LC, Moore JH. Automating Biomedical Data Science Through Tree-Based Pipeline Optimization. Applications of Evolutionary Computation [Internet] Springer, Cham; 2016 [cited 2017 Sep 7]. p. 123–137. [doi: 10.1007/978-3-319-31204-0_9]

[29] Jones E, Oliphant T, Peterson P, Others. SciPy: Open source scientific tools for Python. 2001--

[30] Available from: http://arxiv.org/abs/1705.00335

[31] Chancellor S, Kalantidis Y, Pater JA, De Choudhury M, Shamma DA. Multimodal Classification of Moderated Online Pro-Eating Disorder Content. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems [Internet] New York, NY, USA: ACM

[32] p. 3213–3226. [doi: 10.1145/3025453.3025985]

[33] Available from: http://arxiv.org/abs/1903.05987

[34] Available from: http://arxiv.org/abs/1901.11373

[35] Available from: http://arxiv.org/abs/1810.04805

[36] Available from: http://arxiv.org/abs/1811.01088
