
arxiv: 2605.22641 · v2 · pith:C6UC6JN4new · submitted 2026-05-21 · 💻 cs.CL · cs.AI · cs.LG

More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts

Pith reviewed 2026-05-25 06:06 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.LG
keywords Schwartz values · political text · value detection · retrieval-augmented generation · context effects · DeBERTa · large language models

The pith

Retrieved moral knowledge improves Schwartz value detection in political texts more consistently than added context or larger models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether sentence-level detection of Schwartz values in political writing benefits most from longer input contexts, bigger models, or explicit retrieval of moral knowledge. Experiments compare sentence, window, and full-document inputs; no-retrieval versus retrieval-augmented setups with a curated moral knowledge base; supervised DeBERTa encoders; and zero-shot LLMs ranging from 12B to 123B parameters. Full-document context raises macro-F1 by 3.8-4.8 points for the encoders but helps the LLMs less reliably, while early-fusion retrieval of moral knowledge raises performance across every model family and context length tested. Scaling model size produces no guaranteed gains, and early fusion beats the tested late-fusion and cross-attention variants. The work concludes that context, knowledge, and model family must be evaluated together rather than assuming longer inputs or bigger models are always superior.
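
The three input conditions can be pictured with a small sketch. It assumes a document is already split into sentences and that the window is symmetric around the target; the function name `build_input`, the `window` parameter, and the toy sentences are illustrative and not taken from the paper, which may mark the target sentence or truncate long documents differently.

```python
from typing import List


def build_input(sentences: List[str], target_idx: int, mode: str, window: int = 1) -> str:
    """Return the text shown to the model for one target sentence.

    mode is one of "sentence", "window", or "document" (hypothetical names).
    """
    if mode == "sentence":
        context = [sentences[target_idx]]
    elif mode == "window":
        # Symmetric window of `window` sentences on each side (an assumption).
        lo = max(0, target_idx - window)
        hi = min(len(sentences), target_idx + window + 1)
        context = sentences[lo:hi]
    elif mode == "document":
        context = list(sentences)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return " ".join(context)


# Invented toy document, for illustration only.
doc = ["First claim.", "Target claim.", "Closing claim.", "Afterword."]
sentence_only = build_input(doc, 1, "sentence")
windowed = build_input(doc, 1, "window")
full_doc = build_input(doc, 1, "document")
```

The label always belongs to the target sentence; only the amount of surrounding text changes between conditions.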

Core claim

The central claim is that retrieved moral knowledge under early fusion improves macro-F1 for every tested model family and every context condition, while full-document context helps supervised DeBERTa encoders by 3.8-4.8 points but does not help zero-shot LLMs consistently, and scaling from DeBERTa-v3-base to large or from 12B to larger LLMs does not guarantee gains. Simple early fusion outperforms late-fusion and cross-attention RAG variants. Per-value results show the largest lifts for socially situated or conceptually neighboring values.

What carries the argument

Early fusion of retrieved entries from a curated moral knowledge base with the input text for sentence-level Schwartz value classification.
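
As a rough illustration, early fusion can be read as joining the retrieved knowledge to the input text before the model sees either. The sketch below assumes plain string concatenation with a separator; the function name `early_fusion`, the `[SEP]` separator, the ordering of text and knowledge, and the sample strings are hypothetical choices, not the paper's documented format.

```python
from typing import List


def early_fusion(text: str, kb_entries: List[str], sep: str = " [SEP] ") -> str:
    """Concatenate retrieved knowledge with the input text before encoding."""
    if not kb_entries:
        # No-retrieval condition: the input is passed through unchanged.
        return text
    knowledge = " ".join(kb_entries)
    return f"{text}{sep}{knowledge}"


# Invented placeholder strings, for illustration only.
fused = early_fusion("Target sentence.", ["Definition of value A.", "Definition of value B."])
unfused = early_fusion("Target sentence.", [])
```

Late-fusion and cross-attention variants would instead combine the text and the knowledge after they have been encoded separately, which this sketch does not attempt to model.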

If this is right

  • Full-document context improves supervised DeBERTa encoders by 3.8-4.8 macro-F1 points over sentence-only input.
  • Retrieved moral knowledge raises performance for every model family and context condition under early fusion.
  • Scaling model size from DeBERTa-v3-base to large or from 12B to larger LLMs does not guarantee accuracy gains.
  • Context and retrieval help most for socially situated or conceptually confusable Schwartz values.
  • Simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders.
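
The macro-F1 figures quoted above average per-value F1 scores with equal weight, so rare values count as much as frequent ones. A minimal sketch of that aggregation for multi-label predictions follows; it assumes each sentence carries a set of value labels and that an F1 with no true or predicted positives is scored as 0.0, which is a convention chosen here and not one the paper is known to use. The label names are placeholders.

```python
from typing import List, Sequence, Set


def f1_for_label(gold: List[Set[str]], pred: List[Set[str]], label: str) -> float:
    """Binary F1 for one value, treating each sentence as present/absent."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    denom = 2 * tp + fp + fn
    # Assumed convention: undefined F1 is reported as 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: Sequence[str]) -> float:
    """Unweighted mean of per-label F1 scores."""
    return sum(f1_for_label(gold, pred, lab) for lab in labels) / len(labels)


# Invented toy annotations with placeholder labels "A" and "B".
gold = [{"A"}, {"B"}, {"A", "B"}]
pred = [{"A"}, {"A"}, {"B"}]
score = macro_f1(gold, pred, ["A", "B"])
```

A delta such as the reported 3.8-4.8 points would then be the difference between two such scores computed on the same test set, multiplied by 100.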

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Political value detection pipelines may need domain-specific moral knowledge bases rather than generic retrieval.
  • The joint-evaluation recommendation could apply to other implicit classification tasks where neighboring categories are easily confused.
  • Systems that treat context length, knowledge retrieval, and model choice as independent choices may underperform those that optimize them together.

Load-bearing premise

The curated moral knowledge base supplies accurate, non-confounding distinctions among neighboring Schwartz values that transfer to the political texts in the evaluation set.

What would settle it

An independent test set of political sentences in which early-fusion retrieval of the moral knowledge base produces no macro-F1 gain or produces a loss for all model families and context lengths.
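
The settling condition can be stated as a simple check over matched score pairs. The sketch assumes one no-retrieval and one early-fusion macro-F1 per (model family, context) cell; the names `retrieval_gains` and `claim_refuted` and all numbers are placeholders invented for illustration, not results from the paper.

```python
from typing import Dict, Tuple

Condition = Tuple[str, str]  # (model family, context condition)


def retrieval_gains(scores: Dict[Condition, Tuple[float, float]]) -> Dict[Condition, float]:
    """Map each condition to (retrieval score - no-retrieval score)."""
    return {cond: rag - base for cond, (base, rag) in scores.items()}


def claim_refuted(scores: Dict[Condition, Tuple[float, float]]) -> bool:
    """True when retrieval yields no gain, or a loss, in every condition."""
    return all(gain <= 0 for gain in retrieval_gains(scores).values())


# Placeholder scores (no-retrieval, retrieval); not taken from the paper.
no_gain_anywhere = {("family_a", "ctx_1"): (0.5, 0.5), ("family_b", "ctx_2"): (0.6, 0.55)}
gain_somewhere = {("family_a", "ctx_1"): (0.5, 0.6), ("family_b", "ctx_2"): (0.6, 0.55)}
```

Under this reading, a single condition with a positive gain is enough to keep the test from counting as a refutation, although it would still weaken the "every condition" wording of the core claim.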

Figures

Figures reproduced from arXiv: 2605.22641 by Paolo Rosso, Víctor Yeste.

Figure 1: Encoder-side RAG fusion ablation. All RAG variants use the same retrieved KB chunks; only the fusion …
Figure 2: Experiment pipeline from a fixed target sen…
Figure 3: Compact orientation map of the refined Schwartz 19-value taxonomy used as the label space. Dashed …
Figure 4: Zero-shot LLM prompt template. The optional external-knowledge block is included only for RAG …
Original abstract

Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained distinctions between neighboring values. We study when context and explicit moral knowledge help sentence-level value detection. Using the ValuesML/Touché ValueEval format, we compare sentence, window, and full-document inputs; no-RAG and retrieval-augmented settings with a curated moral knowledge base; supervised DeBERTa-v3-base/large encoders; and zero-shot LLMs from 12B to 123B parameters. The results show that more context is not uniformly better: full-document context improves supervised DeBERTa encoders by 3.8-4.8 macro-F1 points over sentence-only input, but does not consistently help zero-shot LLMs. Retrieved moral knowledge is more consistently useful in matched comparisons, improving each tested model family and context condition under early fusion. However, scaling from DeBERTa-v3-base to large and from 12B to larger LLMs does not guarantee gains, and simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders. Per-value analyses show that context and retrieval help most for socially situated or conceptually confusable values. These findings suggest that value-sensitive NLP should evaluate context, knowledge, and model family jointly rather than treating longer inputs or larger models as universal improvements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports a comparative empirical study of Schwartz value detection in political texts (ValuesML/Touché ValueEval data). It evaluates sentence vs. window vs. full-document inputs, no-RAG vs. RAG with a curated moral knowledge base under early/late/cross-attention fusion, supervised DeBERTa-v3 encoders, and zero-shot LLMs (12B–123B). Headline results are that full-document context boosts encoders by 3.8–4.8 macro-F1 but does not reliably help LLMs, that early-fusion RAG improves every model family and context condition, and that model scaling does not guarantee gains; per-value breakdowns indicate larger benefits for confusable or socially situated values.

Significance. If the matched comparisons and per-condition breakdowns are robust, the work supplies concrete evidence that explicit moral-knowledge retrieval can be more consistently beneficial than longer context or larger models for this task, supporting the broader recommendation to evaluate context, knowledge, and model family jointly rather than assuming monotonic improvements from scale or length.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (results): the directional claims rest on macro-F1 deltas of 3.8–4.8 points and statements that RAG “improves each tested model family,” yet the provided text supplies no information on statistical testing, number of runs, variance, or data-split protocol; these details are load-bearing for interpreting whether the observed improvements are reliable or could be due to split-specific effects.
  2. [§3 and §5] §3 (experimental setup) and §5 (per-value analysis): the central claim that the curated knowledge base supplies non-confounding distinctions among neighboring Schwartz values is tested only indirectly via downstream transfer; an explicit validation (e.g., inter-annotator agreement on the knowledge base itself or ablation of individual value definitions) would strengthen the interpretation that gains are attributable to accurate moral distinctions rather than incidental lexical cues.
minor comments (2)
  1. Table captions and result figures should explicitly state the exact number of Schwartz values, the macro-F1 aggregation method, and whether the reported deltas are over the same test set across all conditions.
  2. [Abstract] The abstract’s phrasing “simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders” would benefit from a one-sentence clarification of what “outperforms” means (absolute F1, rank order, or statistical test).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and constructive feedback. We address each major comment below.

  1. Referee: [Abstract and §4] Abstract and §4 (results): the directional claims rest on macro-F1 deltas of 3.8–4.8 points and statements that RAG “improves each tested model family,” yet the provided text supplies no information on statistical testing, number of runs, variance, or data-split protocol; these details are load-bearing for interpreting whether the observed improvements are reliable or could be due to split-specific effects.

    Authors: We agree that reporting statistical details is necessary to support the reliability of the reported deltas. In the revised manuscript we will add the data-split protocol, the number of runs performed, standard deviations across runs, and statistical significance tests (e.g., bootstrap or paired permutation tests) for the key macro-F1 improvements. These additions will appear in §4 and will be referenced concisely in the abstract. revision: yes

  2. Referee: [§3 and §5] §3 (experimental setup) and §5 (per-value analysis): the central claim that the curated knowledge base supplies non-confounding distinctions among neighboring Schwartz values is tested only indirectly via downstream transfer; an explicit validation (e.g., inter-annotator agreement on the knowledge base itself or ablation of individual value definitions) would strengthen the interpretation that gains are attributable to accurate moral distinctions rather than incidental lexical cues.

    Authors: The knowledge base was derived directly from the canonical Schwartz value definitions and supporting literature to emphasize theoretically motivated distinctions. While we concur that an explicit validation study would further strengthen attribution, the paper’s primary contribution is the controlled comparison of retrieval effects on the downstream task. The per-value results in §5 already show larger gains precisely on confusable values, which is consistent with the KB supplying useful distinctions rather than generic lexical cues. We will expand the curation description in §3 and, if space allows, include a limited ablation of value definitions; a full inter-annotator study on the KB itself lies outside the current experimental scope. revision: partial
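
One of the tests named in the first response, a paired permutation test, can be sketched as follows. It assumes two systems evaluated on the same test items and swaps their predictions item by item under the null hypothesis that the systems are interchangeable. The function names, the resample count, the fixed seed, and the use of accuracy as the metric are choices made for this illustration; a macro-F1 function could be passed as `metric` instead, and the authors' actual protocol is not specified in the text.

```python
import random
from typing import Callable, Sequence


def accuracy(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of items predicted correctly (stand-in metric)."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def paired_permutation_test(
    gold: Sequence[int],
    pred_a: Sequence[int],
    pred_b: Sequence[int],
    metric: Callable[[Sequence[int], Sequence[int]], float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> float:
    """Two-sided p-value for the metric difference between two systems."""
    rng = random.Random(seed)
    observed = abs(metric(gold, pred_a) - metric(gold, pred_b))
    hits = 0
    for _ in range(n_resamples):
        swapped_a, swapped_b = [], []
        for a, b in zip(pred_a, pred_b):
            # Swap the two systems' outputs on this item with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            swapped_a.append(a)
            swapped_b.append(b)
        diff = abs(metric(gold, swapped_a) - metric(gold, swapped_b))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate away from an exact zero.
    return (hits + 1) / (n_resamples + 1)


# Invented toy labels: identical systems should give p = 1.0.
same = paired_permutation_test([1, 0, 1, 1], [1, 0, 1, 1], [1, 0, 1, 1], accuracy, n_resamples=50)
```

Swapping whole predictions per item, instead of flipping per-item score differences, keeps the procedure valid for metrics such as macro-F1 that do not decompose into a sum over items.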

Circularity Check

0 steps flagged

No significant circularity; empirical comparisons are self-contained


The paper reports a systematic empirical comparison of input conditions (sentence vs. window vs. document), retrieval settings (no-RAG vs. early-fusion RAG with a fixed curated knowledge base), and model families (DeBERTa encoders and zero-shot LLMs of varying scale). All headline results are measured outcomes on held-out evaluation data using standard macro-F1; no quantity is obtained by fitting a parameter to the target metric and then re-reporting that metric as a prediction, no derivation reduces to a self-citation, and no uniqueness theorem or ansatz is invoked. The experimental design directly tests the transfer of the external knowledge base rather than presupposing its correctness, so the reported improvements are falsifiable observations rather than definitional restatements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The study is an empirical comparison that relies on an existing annotated dataset and standard supervised and zero-shot modeling practices; no new free parameters, axioms beyond domain norms, or invented entities are introduced.

axioms (1)
  • domain assumption: The ValuesML/Touché ValueEval annotations constitute reliable ground truth for sentence-level Schwartz values.
    All reported comparisons rest on this dataset.


