
arxiv: 2010.02494 · v1 · submitted 2020-10-06 · 💻 cs.CL

Help! Need Advice on Identifying Advice

Pith reviewed 2026-05-24 14:42 UTC · model grok-4.3

classification 💻 cs.CL
keywords advice identification · pragmatics · reddit dataset · natural language processing · sentence annotation · language models · emotional support

The pith

A new annotated dataset from Reddit shows that identifying advice in text remains challenging for language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a dataset of sentences from two Reddit advice forums, labeled for whether they contain advice or not. The authors analyze the linguistic features of advice, which can be explicit or implicit and mixed with emotional support. They test preliminary models and find that pre-trained language models perform better than simple rule-based approaches but still fall short, indicating the task is difficult. This work aims to improve understanding of language pragmatics in advice-giving and seeking scenarios.

Core claim

The authors present an English dataset from r/AskParents and r/needadvice annotated at the sentence level for advice presence. Analysis shows rich linguistic phenomena in advice discourse. Preliminary models demonstrate that pre-trained language models capture advice better than rule-based systems, yet advice identification is challenging.

What carries the argument

Sentence-level annotation scheme distinguishing advice (explicit and implicit) from non-advice like emotional support in online forum posts.
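
To make the rule-based side of the comparison concrete, here is a minimal sketch of a surface-cue baseline. The cue patterns, the name `looks_like_advice`, and the example sentences are illustrative assumptions; they are not the rules or data used in the paper.

```python
import re

# Hypothetical cue patterns for explicit advice; NOT the paper's actual rules.
ADVICE_CUES = [
    r"\byou (should|could|might want to|need to|ought to)\b",
    r"\b(i would|i'd) (suggest|recommend|advise)\b",
    r"\b(try|consider) \w+ing\b",
    r"\bhave you (tried|considered)\b",
]
ADVICE_RE = re.compile("|".join(ADVICE_CUES), re.IGNORECASE)


def looks_like_advice(sentence: str) -> bool:
    """Return True if the sentence matches any explicit-advice cue."""
    return ADVICE_RE.search(sentence) is not None


# Invented example sentences, one per category discussed above.
examples = [
    "You should talk to her teacher first.",          # explicit advice
    "I'm so sorry you're going through this.",        # emotional support
    "When my son did that, taking a break helped.",   # implicit advice via narrative
]
predictions = [looks_like_advice(s) for s in examples]
```

In this toy run only the first sentence is flagged. The third sentence carries advice implicitly through a personal narrative and matches no cue, which is the kind of case where surface rules would be expected to fall short.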

If this is right

  • Online advice forums could automatically surface advice content more efficiently.
  • Natural language generation systems could produce more targeted advice responses.
  • Better models for pragmatic language understanding would follow from improved advice detection.
  • The dataset enables targeted research on implicit versus explicit advice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same annotation approach could extend to other online communities to compare advice styles.
  • Context from surrounding sentences or user history might improve model performance beyond sentence-level input.
  • If advice identification improves, it could support tools that help users seek or give advice more effectively.

Load-bearing premise

Human annotators can produce reliable labels that distinguish advice, including implicit advice, from non-advice such as emotional support.

What would settle it

A replication study with independent annotators showing low agreement on advice labels, or a new model reaching near-human accuracy on the held-out test set.
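
Agreement between two annotators on a binary advice label is commonly summarized with Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is agreement expected by chance. The sketch below computes it from scratch; the function name `cohens_kappa` and the toy labels are assumptions for illustration, not data from the paper.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy labels (1 = advice, 0 = non-advice), invented for illustration.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 6 of 8 sentences (p_o = 0.75) and chance agreement is 0.5, so kappa comes out at 0.5. A replication that yields a low kappa on the real labels would count against the premise above.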

Figures

Figures reproduced from arXiv: 2010.02494 by Benjamin T Chen, Junyi Jessy Li, Katrin Erk, Rebecca Warholic, Venkata Subrahmanyan Govindarajan.

Figure 1: Frequency of discourse connective "though". X-axis: Frequency, Y-axis: Percentage progress through a reply, 0 is beginning and 100 is end of reply.
Figure 2: Attention distribution of a reply to a post ti…
Figure 3: Attention distribution of a reply to a post ti…

Original abstract

Humans use language to accomplish a wide variety of tasks - asking for and giving advice being one of them. In online advice forums, advice is mixed in with non-advice, like emotional support, and is sometimes stated explicitly, sometimes implicitly. Understanding the language of advice would equip systems with a better grasp of language pragmatics; practically, the ability to identify advice would drastically increase the efficiency of advice-seeking online, as well as advice-giving in natural language generation systems. We present a dataset in English from two Reddit advice forums - r/AskParents and r/needadvice - annotated for whether sentences in posts contain advice or not. Our analysis reveals rich linguistic phenomena in advice discourse. We present preliminary models showing that while pre-trained language models are able to capture advice better than rule-based systems, advice identification is challenging, and we identify directions for future research.

Comments: To be presented at EMNLP 2020.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents a new English dataset of sentences drawn from two Reddit advice forums (r/AskParents and r/needadvice) annotated for the presence or absence of advice. It provides a linguistic analysis of advice discourse and reports preliminary experiments in which pre-trained language models outperform rule-based baselines, while noting that advice identification remains challenging.

Significance. If the annotations are shown to be reliable, the released dataset would constitute a useful resource for computational pragmatics research on advice as a speech act, with downstream relevance to dialogue systems and online forum tools. The paper's explicit release of an annotated corpus is a concrete strength that can be built upon regardless of the preliminary modeling results.

major comments (2)
  1. [§3] §3 (Dataset Creation and Annotation): No inter-annotator agreement figures, annotation guidelines, or adjudication procedure are reported. This is load-bearing for the central claim because the abstract itself emphasizes that advice is “sometimes stated explicitly, sometimes implicitly” and is mixed with emotional support; without evidence that annotators reached stable consensus on these boundary cases, the dataset cannot yet serve as a reliable benchmark.
  2. [§5] §5 (Experiments): The claim that “pre-trained language models are able to capture advice better than rule-based systems” is presented without any performance metrics, model architectures, hyper-parameters, or error analysis. This prevents evaluation of whether the reported advantage is substantive or merely an artifact of the (unvalidated) labels.
minor comments (1)
  1. [Abstract] The abstract is somewhat long and could be tightened by moving the high-level motivation to the introduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which identifies key areas where additional detail will strengthen the manuscript. We address each major comment below and will revise the paper to incorporate the requested information.

  1. Referee: [§3] §3 (Dataset Creation and Annotation): No inter-annotator agreement figures, annotation guidelines, or adjudication procedure are reported. This is load-bearing for the central claim because the abstract itself emphasizes that advice is “sometimes stated explicitly, sometimes implicitly” and is mixed with emotional support; without evidence that annotators reached stable consensus on these boundary cases, the dataset cannot yet serve as a reliable benchmark.

    Authors: We agree that inter-annotator agreement, annotation guidelines, and the adjudication procedure are essential to demonstrate dataset reliability, particularly for the implicit/explicit and advice/support boundary cases highlighted in the abstract. These details were collected during annotation but omitted from the initial submission. In the revised manuscript we will add the annotation guidelines (as an appendix), report inter-annotator agreement (Cohen’s kappa), and describe the adjudication process. This directly addresses the concern and supports the dataset’s value as a benchmark.

    revision: yes

  2. Referee: [§5] §5 (Experiments): The claim that “pre-trained language models are able to capture advice better than rule-based systems” is presented without any performance metrics, model architectures, hyper-parameters, or error analysis. This prevents evaluation of whether the reported advantage is substantive or merely an artifact of the (unvalidated) labels.

    Authors: We acknowledge that the current §5 provides only a high-level statement of the preliminary results without the supporting metrics, architectures, hyperparameters, or error analysis needed for proper evaluation. We will expand this section to report concrete performance numbers (e.g., precision, recall, F1), the specific pre-trained models and rule-based baselines used, hyperparameter settings, and a brief error analysis of difficult cases. These additions will allow readers to assess the substantive nature of the reported advantage.

    revision: yes
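
For readers who want to check the kind of numbers the second response promises, the following sketch computes precision, recall, and F1 for the advice class from gold and predicted sentence labels. The function name `precision_recall_f1` and the toy labels are assumptions for illustration; they do not reproduce any result from the paper.

```python
def precision_recall_f1(gold, predicted, positive=1):
    """Precision, recall, and F1 for one class over paired label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (1 = advice, 0 = non-advice), invented for illustration.
gold = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 0, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

With these toy labels there are 2 true positives, 1 false positive, and 2 false negatives, giving precision 2/3, recall 1/2, and F1 4/7. Reporting all three for both the rule-based and pre-trained systems would let readers judge how large the claimed gap is.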

Circularity Check

0 steps flagged

No circularity: empirical dataset creation with no derivations or self-referential predictions

The paper presents an annotated dataset from Reddit forums and preliminary model comparisons for advice identification. No mathematical derivations, fitted parameters renamed as predictions, uniqueness theorems, or self-citation chains appear in the abstract or described content. The contribution is data collection and empirical evaluation against rule-based baselines; no step reduces by construction to its own inputs. This is the expected non-finding for a resource paper without theoretical claims.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are visible from the abstract; the work consists of data annotation and empirical modeling.


