ConsentDiff at Scale: Longitudinal Audits of Web Privacy Policy Changes and UI Frictions

arxiv: 2512.04316 · v7 · submitted 2025-12-03 · 💻 cs.HC

Haoze Guo

Pith reviewed 2026-05-17 01:42 UTC · model grok-4.3

classification 💻 cs.HC

keywords privacy policy · consent interface · longitudinal audit · web measurement · UI friction · policy churn · consent banner · alignment score

The pith

Longitudinal audits show privacy policies keep churning while consent banners shift toward easier rejection and better policy alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ConsentDiff, a pipeline that takes monthly snapshots of websites to track how privacy policy text and consent user interfaces evolve together. It semantically aligns individual policy clauses across time to measure churn and combines DOM signals with screenshot cues to classify common UI patterns such as banner designs. A new weighted claim-UI alignment score then connects specific policy promises to observable interface features, enabling comparisons across time, regions, and site categories. Measurements indicate ongoing clause-level policy changes, a systematic reduction in higher-friction banner designs, and substantially higher alignment scores on sites where the reject option is visible and low-effort. This approach addresses the gap in understanding whether consent interfaces actually deliver on the commitments stated in policies.

Core claim

ConsentDiff provides a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by combining DOM signals with cues from screenshots. It introduces a weighted claim-UI alignment score that links common policy claims to observable predicates, supporting comparisons over time, regions, and verticals. The resulting measurements indicate continued policy churn, systematic changes to eliminate a higher-friction banner design, and significantly higher alignment where rejecting is visible and lower friction.
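The clause-level churn idea can be illustrated with a minimal sketch. It is not the paper's implementation: the lexical `SequenceMatcher` ratio stands in for whatever semantic similarity the pipeline uses, and the function names, the greedy matching, the 0.6 threshold, and the `churn_rate` definition are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Lexical similarity in [0, 1]; a stand-in for a semantic (embedding) similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align_clauses(old, new, threshold=0.6):
    """Greedily pair each old clause with its most similar unmatched new clause.

    The threshold is a hypothetical cut-off below which clauses count as unrelated.
    """
    pairs, used = [], set()
    for i, old_clause in enumerate(old):
        best_j, best_s = None, threshold
        for j, new_clause in enumerate(new):
            if j in used:
                continue
            s = similarity(old_clause, new_clause)
            if s >= best_s:
                best_j, best_s = j, s
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j, best_s))
    return pairs


def churn(old, new, threshold=0.6):
    """Count unchanged, modified, removed, and added clauses between two snapshots."""
    pairs = align_clauses(old, new, threshold)
    unchanged = sum(1 for _, _, s in pairs if s == 1.0)
    modified = len(pairs) - unchanged
    removed = len(old) - len(pairs)
    added = len(new) - len(pairs)
    total = unchanged + modified + removed + added
    rate = (modified + removed + added) / total if total else 0.0
    return {"unchanged": unchanged, "modified": modified,
            "removed": removed, "added": added, "churn_rate": rate}


if __name__ == "__main__":
    # Toy snapshots, invented for this example only.
    january = ["We use cookies to personalize ads.",
               "You may reject all cookies at any time.",
               "We share data with partners."]
    february = ["We use cookies to personalize ads.",
                "You may reject all non-essential cookies at any time.",
                "We retain data for 12 months."]
    print(churn(january, february))
```

Greedy matching keeps the sketch short. A real pipeline would more plausibly use an optimal assignment over an embedding-based similarity matrix, which is one place where the alignment errors discussed later on this page could enter.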

What carries the argument

The ConsentDiff pipeline, which performs monthly site snapshots, semantic clause alignment for policy churn tracking, and DOM-plus-screenshot classification of consent UI patterns to produce a weighted claim-UI alignment score.
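A weighted claim-UI alignment score of this kind might look like the sketch below. The three predicates follow the examples quoted from the paper further down this page (default-off, visible "Reject all", steps-to-reject ≤ 2). The claim names, the weights, the dictionary layout of the UI observation, and the normalization over claims present in the policy are assumptions, not the paper's definition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class ClaimCheck:
    claim: str                          # hypothetical identifier for a policy claim
    weight: float                       # hypothetical importance weight
    predicate: Callable[[Dict], bool]   # observable UI predicate tied to the claim


# Illustrative checks; names and weights are invented for this sketch.
CHECKS: List[ClaimCheck] = [
    ClaimCheck("opt_in_only", 2.0, lambda ui: not ui["optional_toggles_default_on"]),
    ClaimCheck("easy_reject", 2.0, lambda ui: ui["reject_all_visible"]),
    ClaimCheck("few_steps", 1.0, lambda ui: ui["steps_to_reject"] <= 2),
]


def alignment_score(checks: List[ClaimCheck], ui: Dict,
                    claims_present: Set[str]) -> Optional[float]:
    """Weighted share, in [0, 1], of the policy's claims whose UI predicate holds.

    Only claims actually found in the policy text are scored. Returns None
    when the policy makes none of the scored claims.
    """
    active = [c for c in checks if c.claim in claims_present]
    total = sum(c.weight for c in active)
    if total == 0:
        return None
    satisfied = sum(c.weight for c in active if c.predicate(ui))
    return satisfied / total


if __name__ == "__main__":
    # Toy banner observation: reject is visible and one click away,
    # but optional toggles are pre-enabled.
    banner = {"optional_toggles_default_on": True,
              "reject_all_visible": True,
              "steps_to_reject": 1}
    print(alignment_score(CHECKS, banner, {"opt_in_only", "easy_reject", "few_steps"}))
```

Scoring only the claims a policy actually makes keeps the score from penalizing a site for promises it never stated. Whether the paper normalizes this way is not stated on this page.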

Load-bearing premise

The pipeline's semantic alignment of policy clauses and classification of UI patterns from DOM and screenshots accurately capture real-world policy-UI relationships without substantial interpretation errors or sampling bias.

What would settle it

A manual audit of several hundred sites revealing that the computed alignment scores frequently mismatch human judgments of whether the displayed consent interface actually implements the specific claims found in the current policy text.

Figures

Figures reproduced from arXiv: 2512.04316 by Haoze Guo.

Original abstract

Web privacy is experienced via two public artifacts: site utterances in policy texts, and the actions users are required to take during consent interfaces. In the extensive cross-section audits we've studied, there is a lack of longitudinal data detailing how these artifacts are changing together, and if interfaces are actually doing what they promise in policy. ConsentDiff provides that longitudinal view. We build a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by pulling together DOM signals with cues provided by screenshots. We introduce a novel weighted claim-UI alignment score, connecting common policy claims to observable predicates, and enabling comparisons over time, regions, and verticals. Our measurements suggest continued policy churn, systematic changes to eliminate a higher-friction banner design, and significantly higher alignment where rejecting is visible and lower friction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper builds a monthly snapshot pipeline and weighted alignment score to track privacy policy and consent UI changes together, but the absence of accuracy metrics for the classifiers leaves the reported trends on shaky ground.

The main thing to know is that this work sets up a reproducible way to snapshot sites monthly, semantically align policy clauses across versions, and score how well consent UIs match the claims in the text using DOM signals plus screenshots. The longitudinal angle and the weighted claim-UI score are the actual new pieces; prior cross-sectional audits did not track these elements moving in tandem over time or across regions and verticals. The measurements they surface (continued clause churn, a shift away from higher-friction banner designs, and stronger alignment when a reject option is visible and low-friction) give a concrete picture of how these public artifacts evolve in practice. That framing could be useful for anyone who needs to monitor compliance trends rather than just snapshot them once.

The soft spot is validation. The pipeline description does not include accuracy numbers, error rates, or checks against manual labels for either the semantic alignment step or the UI pattern classifier. If those components misfire at even moderate rates and the errors correlate with site type or region, the headline patterns on churn and alignment differences become hard to interpret. This is a measurement paper, so the lack of that evidence is the central limitation rather than a minor omission.

The work is aimed at HCI and privacy researchers who run or want to run ongoing audits; someone building a compliance dashboard or extending audit methods would get the most out of the pipeline if the code and validation details are released. It deserves peer review so referees can examine the methods section and ask for the missing accuracy analysis.

Referee Report

1 major / 2 minor

Summary. The paper introduces ConsentDiff, a reproducible pipeline for monthly website snapshots that semantically aligns privacy policy clauses to measure clause-level churn, classifies consent UI patterns by combining DOM signals with screenshot cues, and defines a novel weighted claim-UI alignment score linking policy claims to observable UI predicates. Measurements indicate continued policy churn, systematic removal of higher-friction banner designs, and significantly higher alignment scores in cases where rejection options are visible and friction is low.

Significance. If the pipeline components prove reliable, the work supplies valuable longitudinal empirical data on the co-evolution of privacy policy text and consent interfaces, enabling comparisons across time, regions, and verticals. The reproducible pipeline and the claim-UI alignment score are concrete strengths that support falsifiable follow-up studies and could inform regulatory audits of GDPR/CCPA-style consent mechanisms.

major comments (1)

[Pipeline and measurement sections] The abstract and methods description present the semantic alignment procedure and UI classifier as central to all reported trends, yet supply no accuracy metrics, error rates, inter-annotator agreement, or ground-truth validation for either component. Because the headline claims of continued churn, systematic banner changes, and 'significantly higher alignment' rest directly on these measurements, the absence of quantitative validation is load-bearing and must be addressed before the observational results can be interpreted with confidence.

minor comments (2)

[Abstract] The phrase 'significantly higher alignment' should be accompanied by the statistical test, p-value threshold, and effect-size information used.
[Data collection] The manuscript would benefit from an explicit statement of the sampling frame (how sites and regions were selected) and any exclusion rules applied to snapshots.
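To make the first minor comment concrete, the sketch below shows one way a difference in alignment scores between two groups of sites could be reported with both a p-value and an effect size. A two-sided permutation test and Cohen's d are one reasonable choice, not necessarily what the authors used; the function names and the toy scores are assumptions and are not the paper's data.

```python
import random
from statistics import mean, stdev


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation.

    Assumes each group has at least two values and non-zero pooled variance.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        # Small tolerance so float rounding does not hide exact ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic alignment scores, invented for this example only.
    reject_visible = [0.9, 0.8, 0.85, 0.95, 0.7, 0.9]
    reject_hidden = [0.4, 0.5, 0.3, 0.6, 0.45, 0.35]
    print("Cohen's d:", cohens_d(reject_visible, reject_hidden))
    print("p-value:", permutation_p_value(reject_visible, reject_hidden))
```

A permutation test avoids distributional assumptions about scores bounded in [0, 1]. In a longitudinal panel, the repeated observations per site would also need to be accounted for, which this sketch does not attempt.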

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for validation of the core pipeline components. We agree this is a substantive issue that must be addressed to strengthen the interpretability of the results and will incorporate the requested metrics in the revision.

Referee: [Pipeline and measurement sections] The abstract and methods description present the semantic alignment procedure and UI classifier as central to all reported trends, yet supply no accuracy metrics, error rates, inter-annotator agreement, or ground-truth validation for either component. Because the headline claims of continued churn, systematic banner changes, and 'significantly higher alignment' rest directly on these measurements, the absence of quantitative validation is load-bearing and must be addressed before the observational results can be interpreted with confidence.

Authors: We acknowledge that the submitted manuscript does not report quantitative validation metrics (accuracy, error rates, inter-annotator agreement, or ground-truth comparisons) for the semantic alignment procedure or the UI classifier. This omission limits confidence in the downstream claims, as noted. In the revised manuscript we will add a new subsection under Methods that details: (1) a manually annotated ground-truth set of 200 policy clauses for semantic alignment, with reported precision/recall and inter-annotator agreement (Cohen’s kappa); (2) a held-out test set of 150 consent UIs with screenshot+DOM labels, reporting classification accuracy and confusion matrices; and (3) an explicit discussion of remaining error sources and their potential impact on the longitudinal trends. We will also release the validation annotations alongside the pipeline code to support reproducibility.

Revision: yes
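The validation metrics named in the rebuttal are standard and can be sketched in a few lines. The code below computes Cohen's kappa for two annotators and precision/recall against gold labels. The function names and the toy labels are assumptions for illustration and do not come from the paper or its promised annotation set.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    p_observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


def precision_recall(gold, predicted, positive):
    """Precision and recall of `predicted` against `gold` for one positive class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labels for four clause pairs, invented for this example only.
    annotator_1 = ["same", "same", "changed", "changed"]
    annotator_2 = ["same", "changed", "changed", "changed"]
    print("kappa:", cohens_kappa(annotator_1, annotator_2))
    print("precision, recall:", precision_recall(annotator_1, annotator_2, "changed"))
```

Reporting kappa alongside raw agreement matters when one label dominates, as "unchanged" plausibly does in month-to-month clause comparisons.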

Circularity Check

0 steps flagged

No significant circularity in empirical measurement pipeline

The paper is an empirical measurement study that builds a reproducible pipeline to snapshot websites monthly, semantically align policy clauses for churn tracking, classify consent UI patterns from DOM signals and screenshots, and compute a novel weighted claim-UI alignment score linking policy claims to observable predicates. No derivation chain, equations, fitted parameters presented as predictions, self-definitional constructs, or load-bearing self-citations are present in the abstract or description. The central findings rely on the pipeline's outputs as direct measurements rather than reducing to inputs by construction. This is self-contained empirical work; the absence of validation metrics is a separate limitation on reliability, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are described in the abstract; the work relies on standard semantic alignment and DOM/screenshot classification techniques.

pith-pipeline@v0.9.0 · 5438 in / 1004 out tokens · 22770 ms · 2026-05-17T01:42:49.619797+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We build a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by pulling together DOM signals with cues provided by screenshots. We introduce a novel weighted claim-UI alignment score"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We pair policy claims with necessary UI predicates (e.g., default-off, visible “Reject all”, steps-to-reject ≤ 2) to obtain an alignment score A ∈ [0,1]"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 3 internal anchors

[1] Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, and Claudia Diaz. 2014. The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS). 674–689. doi:10.1145/2660267.2660347

[2] Joshua D. Angrist and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press

[3] Martin Degeling, Christine Utz, Christopher Lentzsch, Henry Hosseini, Florian Schaub, and Thorsten Holz. 2019. We Value Your Privacy... Now Take Some Cookies: Measuring the GDPR’s Impact on Web Privacy. In Network and Distributed System Security Symposium (NDSS). https://www.ndss-symposium.org/wp-content/uploads/2019/02/ndss2019_01A-3_Degeling_paper.pdf

[4] Steven Englehardt and Arvind Narayanan. 2016. Online Tracking: A 1-Million-Site Measurement and Analysis. In Network and Distributed System Security Symposium (NDSS). https://webtransparency.cs.princeton.edu/webcensus/

[5] European Data Protection Board. 2020. Guidelines 05/2020 on Consent under Regulation 2016/679. https://edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-052020-consent-under-regulation-2016679_en

[6] Colin M. Gray, Nataliia Bielova, Cristiana Santos, et al. 2021. Dark Patterns and the Legal Requirements of Consent Banners. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM. https://www-sop.inria.fr/members/Nataliia.Bielova/papers/Gray-etal-21-CHI.pdf

[7] Haoze Guo and Ziqi Wei. 2026. Behind the Feed: A Taxonomy of User-Facing Cues for Algorithmic Transparency in Social Media. arXiv preprint arXiv:2602.03121 (2026)

[8] Haoze Guo and Ziqi Wei. 2026. Hidden-in-Plain-Text: A Benchmark for Social-Web Indirect Prompt Injection in RAG. arXiv preprint arXiv:2601.10923 (2026)

[9] Haoze Guo and Ziqi Wei. 2026. Temporal Drift in Privacy Recall: Users Misremember From Verbatim Loss to Gist-Based Overexposure. arXiv:2509.16962 [cs.HC]

[10] Hamza Harkous, Kassem Fawaz, Reza Shokri, Bryan Ford, and Karl Aberer. 2018. Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning. In 27th USENIX Security Symposium (USENIX Security). 531–548. https://www.usenix.org/conference/usenixsecurity18/presentation/harkous

[11] IAB Europe. 2020. Transparency & Consent Framework (TCF) v2.0: Policies and Specifications. https://iabeurope.eu/tcf-2-0/

[12] Rebecca Killick, Paul Fearnhead, and Idris A. Eckley. 2012. Optimal Detection of Changepoints With a Linear Computational Cost. J. Amer. Statist. Assoc. 107, 500 (2012), 1590–1598. doi:10.1080/01621459.2012.737745
[13–14] Adam Lerner, Anna Kornfeld Simpson, Tadayoshi Kohno, and Franziska Roesner. 2016. Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016. In Proceedings of the 2016 ACM Web Science Conference (WebSci). 237–246. doi:10.1145/2908131.2908165

[15] Vladimir I. Levenshtein. 1966. Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. Soviet Physics Doklady 10 (1966), 707–710

[16] Marco Lippi, Paolo Torroni, et al. 2019. CLAUDETTE: an Automated Detector of Potentially Unfair Clauses in Online Terms of Service. Artificial Intelligence and Law 27, 2 (2019), 117–139. doi:10.1007/s10506-019-09243-2

[17] Arunesh Mathur, Gunes Acar, Michael J. Friedman, Elena Lucherini, Jonathan Mayer, Marshini Chetty, and Arvind Narayanan. 2019. Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites. Proceedings of the ACM on ... doi:10.1145/3359183

ConsentDiff at Scale: Longitudinal Audits of Web Privacy Policy Changes and UI Frictions. CHI EA ’26, April 13–17, 2026, Barcelona, Sp...

[18] Célestin Matte, Nataliia Bielova, and Cristiana Santos. 2020. Do Cookie Banners Respect My Choice? Measuring Legal Compliance of Banners from IAB Europe’s Transparency and Consent Framework. In 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 791–809. doi:10.1109/SP40000.2020.00025

[19–20] Midas Nouwens, Ilaria Liccardi, Michael Veale, David Karger, and Lalana Kagal. 2020. Dark Patterns after the GDPR: Scraping Consent Pop-ups and Demonstrating Their Influence. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM. doi:10.1145/3313831.3376321
[21] Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Korczyński, and Wouter Joosen. 2019. Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation. In Network and Distributed System Security Symposium (NDSS). https://tranco-list.eu

[22] Alexander J. Ratner, Christopher M. De Sa, Sen Wu, Daniel Selsam, and Christopher Ré. 2017. Data Programming: Creating Large Training Sets, Quickly. In Advances in Neural Information Processing Systems (NeurIPS). https://arxiv.org/abs/1605.07723

[23] Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL, 3982–3992. https://arxiv.org/abs/1908.10084

[24] Jeffrey M. Wooldridge. 2010. Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press
