pith. sign in

arxiv: 2604.19699 · v3 · submitted 2026-04-21 · 💻 cs.CL · cs.CY

Epistemic orientation in parliamentary discourse is associated with deliberative democracy

Pith reviewed 2026-05-10 02:38 UTC · model grok-4.3

classification 💻 cs.CL cs.CY
keywords epistemic orientationparliamentary discoursedeliberative democracyevidence-based reasoninglarge language modelspolitical speech analysisgovernance indicators
0
0 comments X p. Extension

The pith

Parliamentary speeches favoring evidence over intuition track higher deliberative democracy across countries and decades.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a measure called the Evidence-Minus-Intuition score to quantify whether political speech relies on verifiable information or on subjective beliefs. It applies this score to 15 million parliamentary speech segments from seven countries spanning 1946 to 2025. The central finding is a consistent positive association between higher EMI scores and indices of deliberative democracy within each country over time, as well as with the transparency and predictable enforcement of laws. A sympathetic reader would care because the result suggests that the epistemic style of elite discourse is not merely stylistic but tied to measurable qualities of democratic governance.

Core claim

Using large language models to rate speech segments for evidence-based versus intuition-based reasoning, the authors derive an EMI score for each segment and aggregate it at the country-year level. They report that EMI is positively associated with deliberative democracy in both contemporaneous and lagged within-country analyses, and that EMI also correlates positively with the governance dimension of transparent and predictable law implementation.

What carries the argument

The Evidence-Minus-Intuition (EMI) score, obtained from LLM ratings of evidence versus intuition in speech segments combined with embedding-based semantic similarity to quantify epistemic orientation.

If this is right

  • Higher average EMI in a country's parliament in one year predicts higher deliberative democracy scores in subsequent years.
  • The association appears for the specific governance sub-component of transparent and predictable laws.
  • The pattern holds across multiple countries and remains stable when speech is examined at the level of individual segments.
  • Temporal changes in parliamentary discourse style can be tracked quantitatively over eight decades.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the association is causal, interventions that shift parliamentary language toward evidence could produce measurable improvements in democracy scores.
  • The method could be extended to other text corpora such as social media or court rulings to test whether the same epistemic pattern appears outside formal legislatures.
  • Countries that already score high on deliberative democracy may sustain those scores partly by maintaining higher EMI in ongoing debate.

Load-bearing premise

That automated LLM ratings of evidence versus intuition in political speech produce a valid and unbiased measure of epistemic orientation that is not driven by topic, speaker style, or model artifacts.

What would settle it

A replication that rates the same speech segments with human coders and finds no association between those human EMI scores and deliberative democracy indices.

read the original abstract

The pursuit of truth is central to democratic deliberation and governance, yet political discourse reflects varying epistemic orientations, ranging from evidence-based reasoning grounded in verifiable information to intuition-based reasoning rooted in beliefs and subjective interpretation. We introduce a scalable approach to measure epistemic orientation using the Evidence--Minus--Intuition (EMI) score, derived from large language model (LLM) ratings and embedding-based semantic similarity. Applying this approach to 15 million parliamentary speech segments spanning 1946 to 2025 across seven countries, we examine temporal patterns in discourse and its association with deliberative democracy and governance. We find that EMI is positively associated with deliberative democracy within countries over time, with consistent relationships in both contemporaneous and lagged analyses. EMI is also positively associated with the transparency and predictable implementation of laws as a dimension of governance. These findings suggest that the epistemic nature of political discourse is crucial for both the quality of democracy and governance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces the Evidence-Minus-Intuition (EMI) score, derived from LLM ratings of evidence versus intuition in speech segments combined with embedding-based semantic similarity, as a scalable measure of epistemic orientation. Applying this to 15 million parliamentary speech segments across seven countries from 1946 to 2025, the authors report positive within-country temporal associations between EMI and indices of deliberative democracy (including in lagged specifications) as well as with the transparency and predictable implementation of laws as a governance dimension.

Significance. If the EMI score proves to be a valid, stable, and unconfounded proxy for epistemic orientation, the work would deliver large-scale empirical evidence linking evidence-based parliamentary discourse to democratic quality and governance outcomes over decades and across countries. The scale (15M segments), multi-country coverage, long temporal window, and use of both contemporaneous and lagged within-country designs are notable strengths that could advance computational approaches to political discourse analysis.

major comments (3)
  1. [Methods (EMI score construction)] The manuscript provides no reported validation of the LLM-based EMI ratings against human judgments, inter-rater reliability metrics, or cross-model robustness checks. This is load-bearing because LLM-as-judge outputs are known to be sensitive to prompt wording, model choice, and training-data priors on what constitutes 'evidence,' directly affecting interpretation of the positive associations with deliberative democracy and rule-of-law measures.
  2. [Results (temporal and lagged analyses)] No topic-fixed effects, speaker-style controls, or checks for topic-dependent language use (e.g., budget debates vs. moral-issue debates) are described in the association analyses. Parliamentary speech topics correlate with both verifiable claims and democratic indices, raising the possibility that the reported within-country temporal correlations are partly artifactual rather than reflective of epistemic orientation.
  3. [Abstract and Results] The abstract and methods summary supply no information on statistical controls for confounders, robustness to alternative embedding models, or falsification tests (e.g., placebo outcomes). These omissions leave the central claim that EMI is 'positively associated' vulnerable to alternative explanations.
minor comments (2)
  1. [Data description] Clarify the exact data cutoff for the 1946-2025 span and whether any segments post-2023 rely on imputation or different collection methods.
  2. [Methods] The notation for the EMI score (difference of ratings plus embedding similarity) should be formalized with an explicit equation to aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which highlight important areas for strengthening the validity and robustness of our EMI score and its reported associations. We address each major comment point by point below, with clear indications of revisions to be incorporated in the next version of the manuscript.

read point-by-point responses
  1. Referee: The manuscript provides no reported validation of the LLM-based EMI ratings against human judgments, inter-rater reliability metrics, or cross-model robustness checks. This is load-bearing because LLM-as-judge outputs are known to be sensitive to prompt wording, model choice, and training-data priors on what constitutes 'evidence,' directly affecting interpretation of the positive associations with deliberative democracy and rule-of-law measures.

    Authors: We agree that explicit validation is essential for interpreting the EMI score, particularly given documented sensitivities of LLM judgments. Although the initial submission did not include these details, we have conducted a post-submission human validation on a stratified sample of 500 speech segments independently rated by two expert coders, yielding Cohen's kappa of 0.68 for evidence vs. intuition classification. We will also add cross-model robustness checks using GPT-4o and an open-source model (Llama-3-70B). These results, along with prompt sensitivity analyses, will be added to a new subsection in the Methods section of the revised manuscript, with discussion of how they support the main findings. revision: yes

  2. Referee: No topic-fixed effects, speaker-style controls, or checks for topic-dependent language use (e.g., budget debates vs. moral-issue debates) are described in the association analyses. Parliamentary speech topics correlate with both verifiable claims and democratic indices, raising the possibility that the reported within-country temporal correlations are partly artifactual rather than reflective of epistemic orientation.

    Authors: The referee correctly identifies a potential source of confounding, as topic composition may correlate with both EMI and the outcome measures. Our primary specifications rely on within-country temporal variation with year fixed effects and lagged EMI to reduce reverse causality and stable confounders, but we did not explicitly model topic dependence. In the revision, we will incorporate topic-fixed effects using LDA-derived topics from the full 15M-segment corpus and add supplementary analyses with speaker fixed effects (where speaker identifiers are available). These additions will be reported in the Results section to demonstrate that the associations hold after accounting for topic and speaker variation. revision: yes

  3. Referee: The abstract and methods summary supply no information on statistical controls for confounders, robustness to alternative embedding models, or falsification tests (e.g., placebo outcomes). These omissions leave the central claim that EMI is 'positively associated' vulnerable to alternative explanations.

    Authors: We acknowledge that the original abstract and methods did not sufficiently detail these elements, leaving the associations open to alternative interpretations. In the revised manuscript, we will update the abstract to reference key robustness features and expand the Results section with: (i) additional controls for time-varying confounders such as GDP per capita and government ideology; (ii) robustness checks using alternative embedding models (e.g., different sentence-transformer variants); and (iii) falsification tests with placebo outcomes (e.g., unrelated governance indicators like infrastructure spending). These will be presented in new tables, with the abstract revised accordingly. revision: yes

Circularity Check

0 steps flagged

No significant circularity; EMI measurement independent of outcome variables

full rationale

The paper constructs the EMI score from LLM ratings of evidence vs. intuition plus embedding similarity on 15M parliamentary segments, then reports empirical within-country temporal associations with external deliberative democracy indices and rule-of-law measures. No equations or steps reduce the reported associations to a fitted parameter, self-referential definition, or self-citation chain; the measurement pipeline operates on speech text alone and does not incorporate the target governance variables. This is a standard independent measurement followed by correlation analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are described.

pith-pipeline@v0.9.0 · 5457 in / 1184 out tokens · 33656 ms · 2026-05-10T02:38:13.750399+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

26 extracted references · 6 canonical work pages · 2 internal anchors

  1. [1]

    Deliberate Ignorance: Choosing Not to Know

    Lewandowsky, S.: Willful construction of ignorance: A tale of two ontologies. Deliberate Ignorance: Choosing Not to Know. The MIT Press, Cambridge, Massachusetts (2020)

  2. [2]

    Academy of Management Annals17(2), 655–683 (2023)

    Cooper, B., Cohen, T.R., Huppert, E., Levine, E.E., Fleeson, W.: Honest behav- ior: Truth-seeking, belief-speaking, and fostering understanding of the truth in others. Academy of Management Annals17(2), 655–683 (2023)

  3. [3]

    Nature Human Behaviour9, 1122–1133 (2024)

    Aroyehun, S.T., Simchon, A., Carrella, F., Lasser, J., Lewandowsky, S., Gar- cia, D.: Computational analysis of us congressional speeches reveals a shift from evidence to intuition. Nature Human Behaviour9, 1122–1133 (2024)

  4. [4]

    Nature Human Behaviour, 1–12 (2023)

    Lasser, J., Aroyehun, S.T., Carrella, F., Simchon, A., Garcia, D., Lewandowsky, S.:Fromalternativeconceptionsofhonestytoalternativefactsincommunications by us politicians. Nature Human Behaviour, 1–12 (2023)

  5. [5]

    Nature Communications16(1), 1409 (2025) https://doi.org/10.1038/ s41467-025-56753-6

    Carrella, F., Aroyehun, S.T., Lasser, J., Simchon, A., Garcia, D., Lewandowsky, S.: Different honesty conceptions align across US politicians’ tweets and pub- lic replies. Nature Communications16(1), 1409 (2025) https://doi.org/10.1038/ s41467-025-56753-6

  6. [6]

    Proceedings of the National Academy of Sciences121(34), 2308950121 (2024)

    Rathje, S., Mirea, D.-M., Sucholutsky, I., Marjieh, R., Robertson, C.E., Van Bavel, J.J.: Gpt is an effective tool for multilingual psychological text anal- ysis. Proceedings of the National Academy of Sciences121(34), 2308950121 (2024)

  7. [7]

    The Review of Financial Studies, 012 (2025) https://doi.org/10.1093/ rfs/hhaf012

    Jha, M., Liu, H., Manela, A.: Does finance benefit society? a language embedding approach. The Review of Financial Studies, 012 (2025) https://doi.org/10.1093/ rfs/hhaf012

  8. [8]

    https://doi.org/ 10.23696/vdemds26

    Coppedge, M., Gerring, J., Knutsen, C.H., Lindberg, S.I., Teorell, J., Altman, D., Angiolillo, F., Bernhard, M., Cornell, A., Fish, M.S., Fox, L., Gastaldi, L., Gjerløw, H., Glynn, A., Good God, A., Hicken, A., Kinzelbach, K., Krusell, J., Marquardt, K.L., McMann, K., Mechkova, V., Medzihorsky, J., Neundorf, A., Paxton, P., Pemstein, D., Pernes, J., Römer...

  9. [9]

    Political Analysis16(3), 303–323 (2008)

    Grant, J.T., Kelly, N.J.: Legislative productivity of the us congress, 1789–2004. Political Analysis16(3), 303–323 (2008)

  10. [10]

    exploring model label variation

    Plaza-del-Arco,F.M.,Nozza,D.,Hovy,D.:Wisdomofinstruction-tunedlanguage model crowds. exploring model label variation. In: Proceedings of the 3rd Work- shop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 25 2024, pp. 19–30 (2024)

  11. [11]

    American Political Science Review118(1), 345–362 (2024)

    Niemeyer, S., Veri, F., Dryzek, J.S., Bächtiger, A.: How deliberation happens: enabling deliberative reason. American Political Science Review118(1), 345–362 (2024)

  12. [12]

    Stanford libraries (2018)

    Gentzkow, M., Shapiro, J.M., Taddy, M.: Congressional record for the 43rd-114th congresses: Parsed speeches and phrase counts. Stanford libraries (2018). https: //data.stanford.edu/congress_text

  13. [13]

    Richter, F., Koch, P., Franke, O., Kraus, J., Warode, L., Kuruc, F., Heine, S., Schöps, K.: Open discourse: towards the first fully comprehensive and annotated corpus of the parliamentary protocols of the german bundestag (2023)

  14. [14]

    In: Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., Xue, N

    Frasnelli, V., Palmero Aprosio, A.: There’s something new about the Ital- ian parliament: The IPSA corpus. In: Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., Xue, N. (eds.) Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING2024),pp.16037–16046.ELRAandICCL,Torino,...

  15. [15]

    In: Fišer, D., Eskevich, M., Jong, F

    Steingrímsson, S., Barkarson, S., Örnólfsson, G.T.: IGC-parl: Icelandic cor- pus of parliamentary proceedings. In: Fišer, D., Eskevich, M., Jong, F. (eds.) Proceedings of the Second ParlaCLARIN Workshop, pp. 11–

  16. [16]

    https://aclanthology.org/2020.parlaclarin-1.3/

    European Language Resources Association, Marseille, France (2020). https://aclanthology.org/2020.parlaclarin-1.3/

  17. [17]

    In: Fišer, D., Eskevich, M., Jong, F

    Ogrodniczuk, M., Nitoń, B.: New developments in the Polish parliamentary cor- pus. In: Fišer, D., Eskevich, M., Jong, F. (eds.) Proceedings of the Second ParlaCLARIN Workshop, pp. 1–4. European Language Resources Association, Marseille, France (2020).https://aclanthology.org/2020.parlaclarin-1.1/

  18. [18]

    Language Resources and Evaluation59, 3765–3797 (2024)

    Yazar, T., Kutlu, M., Bayirli, I.K.: Turkronicles: diachronic resources for the fast evolving turkish language. Language Resources and Evaluation59, 3765–3797 (2024)

  19. [19]

    The Llama 3 Herd of Models

    Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Vaughan, A., et al.: The llama 3 herd of models. arXiv preprint arXiv:2407.21783 (2024)

  20. [20]

    Language Resources and Evaluation57(1), 415–448 (2023) https://doi.org/10.1007/s10579-021-09574-0 26

    Erjavec, T., Ogrodniczuk, M., Osenova, P., Ljubešić, N., Simov, K., Pančur, A., Rudolf, M., Kopp, M., Barkarson, S., Steingrímsson, S., Çöltekin, Ç., Does, J., Depuydt, K., Agnoloni, T., Venturi, G., Pérez, M.C., Macedo, L.D., Navarretta, C., Luxardo, G., Coole, M., Rayson, P., Morkevičius, V., Krilavičius, T., Darğis, R., Ring, O., Heusden, R., Marx, M.,...

  21. [21]

    Qwen2 Technical Report

    Yang, A., Yang, B., Hui, B., Zheng, B., Yu, B., Zhou, C., Li, C., Li, C., Liu, D., Huang, F., et al.: Qwen2 technical report. eprint arXiv: 2407.10671 (2024)

  22. [22]

    https://arxiv.org/abs/2509.14233 (2025)

    Hernández-Cano, A., Hägele, A., Huang, A.H., Romanou, A., Solergibert, A.-J., Pasztor, B., Messmer, B., Garbaya, D., Ďurech, E.F., Hakimi, I., Giraldo, J.G., Ismayilzada, M., Foroutan, N., Moalla, S., Chen, T., Sabolčec, V., Xu, Y., Aerni, M., AlKhamissi, B., Marinas, I.A., Amani, M.H., Ansaripour, M., Badanin, I., Benoit, H., Boros, E., Browning, N., Bös...

  23. [23]

    Nature Computational Science4(1), 2–3 (2024)

    Palmer, A., Smith, N.A., Spirling, A.: Using proprietary language models in aca- demic research requires explicit justification. Nature Computational Science4(1), 2–3 (2024)

  24. [24]

    In: Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles (2023)

    Kwon, W., Li, Z., Zhuang, S., Sheng, Y., Zheng, L., Yu, C.H., Gonzalez, J.E., Zhang, H., Stoica, I.: Efficient memory management for large language model serving with pagedattention. In: Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles (2023)

  25. [25]

    In: Proceedings of the 2024 Con- ference on Empirical Methods in Natural Language Processing: Industry Track, pp

    Zhang, X., Zhang, Y., Long, D., Xie, W., Dai, Z., Tang, J., Lin, H., Yang, B., Xie, P., Huang, F.,et al.: mgte: Generalized long-context text representation and reranking models for multilingual text retrieval. In: Proceedings of the 2024 Con- ference on Empirical Methods in Natural Language Processing: Industry Track, pp. 1393–1412 (2024)

  26. [26]

    Journal of Economic Surveys39(2), 631–671 (2025) https://doi.org/10.1111/joes.12618 https://onlinelibrary.wiley.com/doi/pdf/10.1111/joes.12618 27

    Bolt, J., Zanden, J.L.: Maddison-style estimates of the evolution of the world economy: A new 2023 update. Journal of Economic Surveys39(2), 631–671 (2025) https://doi.org/10.1111/joes.12618 https://onlinelibrary.wiley.com/doi/pdf/10.1111/joes.12618 27