arxiv: 2601.12407 · v2 · submitted 2026-01-18 · 💻 cs.CR · cs.CL · cs.LG

De-Anonymization at Scale via Tournament-Style Attribution

Pith reviewed 2026-05-16 13:11 UTC · model grok-4.3

classification 💻 cs.CR · cs.CL · cs.LG
keywords de-anonymization · authorship attribution · large language models · privacy risk · double-blind review · tournament-style selection · text matching

The pith

Large language models can link anonymous texts to their authors even from pools of tens of thousands of candidates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents De-Anonymization at Scale, a method that uses large language models to match anonymous documents to candidate authors through repeated partitioning and selection steps. It first applies dense retrieval to shrink the pool, then divides the remaining texts into small groups and asks the model to pick which one shares an author with a query text, iterating on the survivors until a short ranked list results. Multiple independent runs are combined with majority voting to stabilize the output. Tests on anonymized review submissions show the technique identifies same-author texts at rates far above chance in pools of tens of thousands, while also outperforming earlier methods on standard authorship benchmarks. This matters because it turns a theoretical privacy concern into a demonstrated practical risk for any platform that relies on anonymity.

Core claim

De-Anonymization at Scale attributes authorship by randomly partitioning a large candidate corpus into fixed-size groups, prompting an LLM to select from each group the text most likely written by the same author as a given query, and iteratively narrowing the surviving candidates. A dense-retrieval prefilter reduces the initial search space, and majority-voting aggregation across independent runs improves ranking precision. Together these steps enable recovery of same-author texts from pools of tens of thousands with accuracy well above chance on anonymized review data.
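
The partition, select, and narrow loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `tournament`, `overlap_selector`, `group_size`, and `top_k` are hypothetical, the default values are placeholders, and the word-overlap selector is only a toy stand-in for the LLM prompt, whose template is not given here.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical selector signature: given the query and one group of
# candidates, return the index of the chosen candidate within that group.
# In DAS this decision is made by prompting an LLM.
Selector = Callable[[str, Sequence[str]], int]


def overlap_selector(query: str, group: Sequence[str]) -> int:
    """Toy stand-in for the LLM: pick the candidate sharing the most words."""
    query_words = set(query.lower().split())
    scores = [len(query_words & set(text.lower().split())) for text in group]
    return scores.index(max(scores))


def tournament(query: str, candidates: Sequence[str], select: Selector,
               group_size: int = 4, top_k: int = 3, seed: int = 0) -> List[str]:
    """Narrow `candidates` round by round; returns at most `top_k` survivors."""
    if group_size < 2 or top_k < 1:
        raise ValueError("group_size must be >= 2 and top_k must be >= 1")
    rng = random.Random(seed)
    survivors = list(candidates)
    while len(survivors) > top_k:
        rng.shuffle(survivors)  # fresh random partition for this round
        winners = []
        for start in range(0, len(survivors), group_size):
            group = survivors[start:start + group_size]
            winners.append(group[select(query, group)])  # one winner per group
        survivors = winners
    return survivors


# Usage: one text that echoes the query, hidden among 50 fillers.
pool = [f"filler text number {i}" for i in range(50)]
pool.append("we remark that the proof is frankly sloppy")
query = "frankly we remark that the argument is sloppy"
shortlist = tournament(query, pool, overlap_selector, group_size=5, top_k=1)
```

Because each round keeps one winner per group, the last round can leave fewer than `top_k` survivors. How DAS orders its final top-k list is not specified in this summary, so the sketch returns the survivors unranked.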

What carries the argument

Tournament-style sequential partitioning and LLM querying that iteratively narrow authorship candidates from large pools, augmented by dense-retrieval prefiltering and majority-voting aggregation.

If this is right

  • Anonymous platforms such as double-blind review systems face a concrete risk of author identification from large candidate sets.
  • The method scales authorship attribution beyond what direct pairwise comparison could handle while improving accuracy on standard benchmarks like Enron emails.
  • Dense retrieval plus tournament querying together make LLM-based de-anonymization practical at real-world sizes.
  • Majority voting over independent runs reduces sensitivity to individual model mistakes in the selection steps.
  • The approach works on anonymized review data, showing the vulnerability is not limited to artificial or easy cases.
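
The majority-voting point above can be illustrated with a short sketch. The exact aggregation rule used by DAS is not stated in this summary, so the rule below (one vote per candidate per run, ties broken alphabetically) is an assumption, and `aggregate_by_votes` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def aggregate_by_votes(runs: Iterable[Sequence[str]], top_k: int) -> List[str]:
    """Rank candidate ids by how many independent runs shortlisted them."""
    votes: Counter = Counter()
    for shortlist in runs:
        votes.update(set(shortlist))  # a candidate gets at most one vote per run
    # Most votes first; alphabetical order breaks ties deterministically.
    ranked = sorted(votes, key=lambda cid: (-votes[cid], cid))
    return ranked[:top_k]


# Usage: three runs that mostly agree on candidate "a".
runs = [["a", "b"], ["a", "c"], ["b", "a"]]
ranking = aggregate_by_votes(runs, top_k=2)
```

A candidate that survives only because of one unlucky grouping tends to collect few votes, which is the intuition behind the reduced sensitivity to individual selection mistakes.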

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar risks could appear on any platform that publishes many anonymous texts from overlapping author pools if the texts remain accessible.
  • Testing the method on texts that authors have intentionally altered to hide style would clarify how robust the signals are to user countermeasures.
  • The scalability gain suggests the technique could be applied to much larger public corpora, such as social-media archives, once retrieval costs are managed.
  • Future platform defenses might need to combine text obfuscation with limits on how many candidate texts an attacker can access at once.

Load-bearing premise

Large language models can reliably detect authorship signals in anonymized texts even when topic, content, or deliberate style changes might mislead them.

What would settle it

Apply the same DAS procedure to a corpus of texts that have been deliberately rewritten by their authors to erase stylistic markers and check whether top-k accuracy falls to chance levels.
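
To make "falls to chance levels" concrete, the check could be scored along these lines. The sketch assumes each query has exactly one same-author text in the pool, so a uniformly random top-k list contains it with probability k/N. The function names are hypothetical and the numbers in the usage lines are made up for illustration, not taken from the paper.

```python
from typing import Sequence


def top_k_accuracy(shortlists: Sequence[Sequence[str]], truths: Sequence[str]) -> float:
    """Fraction of queries whose true same-author id appears in the shortlist."""
    if len(shortlists) != len(truths):
        raise ValueError("need one shortlist per ground-truth label")
    hits = sum(truth in shortlist for shortlist, truth in zip(shortlists, truths))
    return hits / len(truths)


def chance_top_k(pool_size: int, k: int) -> float:
    """Expected top-k accuracy of random guessing with one true match per query."""
    return min(1.0, k / pool_size)


# Usage with illustrative values only.
accuracy = top_k_accuracy([["a", "b"], ["c", "d"]], ["a", "x"])
baseline = chance_top_k(pool_size=10_000, k=10)
```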

Original abstract

As LLMs rapidly advance and enter real-world use, their privacy implications are increasingly important. We study an authorship de-anonymization threat: using LLMs to link anonymous documents to their authors, potentially compromising settings such as double-blind peer review. We propose De-Anonymization at Scale (DAS), a large language model-based method for attributing authorship among tens of thousands of candidate texts. DAS uses a sequential progression strategy: it randomly partitions the candidate corpus into fixed-size groups, prompts an LLM to select the text most likely written by the same author as a query text, and iteratively re-queries the surviving candidates to produce a ranked top-k list. To make this practical at scale, DAS adds a dense-retrieval prefilter to shrink the search space and a majority-voting style aggregation over multiple independent runs to improve robustness and ranking precision. Experiments on anonymized review data show DAS can recover same-author texts from pools of tens of thousands with accuracy well above chance, demonstrating a realistic privacy risk for anonymous platforms. On standard authorship benchmarks (Enron emails and blog posts), DAS also improves both accuracy and scalability over prior approaches, highlighting a new LLM-enabled de-anonymization vulnerability.
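
The dense-retrieval prefilter mentioned in the abstract can be pictured as a nearest-neighbour cut over text embeddings. The sketch below assumes the embeddings have already been computed by some encoder (the abstract does not name one) and that cosine similarity is the scoring function, which is a common choice but an assumption here. `prefilter` and `keep` are hypothetical names.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def prefilter(query_vec: Sequence[float],
              candidate_vecs: Dict[str, Sequence[float]],
              keep: int) -> List[str]:
    """Return the ids of the `keep` candidates most similar to the query."""
    ranked = sorted(candidate_vecs,
                    key=lambda cid: cosine(query_vec, candidate_vecs[cid]),
                    reverse=True)
    return ranked[:keep]


# Usage with tiny made-up 2-d embeddings.
vectors = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
kept = prefilter([1.0, 0.0], vectors, keep=2)
```

Only the kept candidates would then be passed to the LLM selection rounds, which is what makes the later stage affordable on large pools.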

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes De-Anonymization at Scale (DAS), an LLM-based authorship attribution method for large candidate pools (tens of thousands of texts). DAS shrinks the pool with a dense-retrieval prefilter, randomly partitions the remaining candidates into groups, repeatedly queries an LLM to select the most likely same-author match, and aggregates rankings over multiple independent runs. The authors report recovery of same-author texts at rates well above chance on anonymized review data, and improved accuracy and scalability over prior methods on the Enron email and blog-post benchmarks.

Significance. If the results hold after controlling for content leakage, the work would demonstrate a scalable, realistic de-anonymization threat to anonymous platforms such as double-blind review, extending LLM capabilities into a new privacy attack vector with practical implications for system design.

major comments (1)
  1. [Abstract / Experiments on anonymized review data] Abstract and experimental description: no topic-matched or subfield-controlled baselines are mentioned for the review-data experiments. Same-author documents in academic review pools are likely to share research topics, so sequential LLM selection could succeed via semantic similarity rather than stylistic authorship signals; this directly affects whether the above-chance accuracy supports the claimed de-anonymization capability.
minor comments (1)
  1. [Method] The description of the majority-voting aggregation and the exact prompting templates would benefit from explicit pseudocode or a numbered algorithmic listing to allow reproduction.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and constructive feedback. We address the major comment on the review-data experiments below and will revise the manuscript accordingly to strengthen the claims.

point-by-point responses
  1. Referee: Abstract and experimental description: no topic-matched or subfield-controlled baselines are mentioned for the review-data experiments. Same-author documents in academic review pools are likely to share research topics, so sequential LLM selection could succeed via semantic similarity rather than stylistic authorship signals; this directly affects whether the above-chance accuracy supports the claimed de-anonymization capability.

    Authors: We agree that isolating stylistic authorship signals from topical similarity is essential for validating the de-anonymization claim. The review-data experiments were designed to reflect realistic double-blind review settings, where same-author documents frequently share subfields. To address this directly, we will add a content-only baseline that uses only the dense-retrieval prefilter (without the LLM tournament or majority voting) and report its accuracy against full DAS on the same pools. We will also include a comparison to a simple topic-similarity oracle if subfield metadata can be recovered from the anonymized data. These additions will quantify how much the LLM-based attribution improves over pure semantic matching and clarify the contribution of authorship signals. revision: yes
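
One way the promised comparison could be tabulated is per query, recording whether full DAS and the retrieval-only baseline each placed the true match in their top-k list. This is only a sketch of such a tally under that assumption. `paired_outcomes` and the boolean inputs are hypothetical, and the example values are invented.

```python
from typing import Dict, Sequence


def paired_outcomes(das_hits: Sequence[bool],
                    baseline_hits: Sequence[bool]) -> Dict[str, int]:
    """Count queries by which method(s) recovered the true same-author text."""
    if len(das_hits) != len(baseline_hits):
        raise ValueError("both methods must be scored on the same queries")
    counts = {"both": 0, "das_only": 0, "baseline_only": 0, "neither": 0}
    for das_hit, base_hit in zip(das_hits, baseline_hits):
        if das_hit and base_hit:
            counts["both"] += 1
        elif das_hit:
            counts["das_only"] += 1
        elif base_hit:
            counts["baseline_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Usage with invented outcomes for five queries.
table = paired_outcomes([True, True, False, False, True],
                        [True, False, False, True, False])
```

A `das_only` count well above `baseline_only` would suggest the LLM stage adds signal beyond semantic retrieval. It would not by itself show that the added signal is stylistic rather than topical, which is the referee's underlying concern.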

Circularity Check

0 steps flagged

Empirical algorithm with no circular derivation chain

full rationale

The paper presents DAS as a practical algorithmic procedure (random partitioning into groups, LLM-based selection of same-author candidates, dense-retrieval prefilter, and majority-voting aggregation) and evaluates it via experiments on anonymized review data plus standard benchmarks. No equations, self-definitions, or load-bearing self-citations are shown that would reduce any claimed prediction or result to its own inputs by construction. The central claims rest on reported experimental accuracies rather than any mathematical reduction or renamed known result.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The method rests on the domain assumption that LLMs can detect authorship signals in text and on standard retrieval techniques; no new entities are introduced, and free parameters such as group size and run count are implementation choices rather than fitted constants.

free parameters (2)
  • group size
    Fixed-size groups used in random partitioning; exact value not stated in abstract.
  • number of independent runs
    Used for majority-voting aggregation; exact count not stated in abstract.
axioms (1)
  • domain assumption LLMs can detect authorship signals in text even after anonymization
    Invoked as the basis for the prompting strategy in the sequential selection process.
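
The two free parameters above largely determine query cost. As a rough sketch, assuming one LLM call per group and one surviving candidate per group (neither is confirmed by the abstract), the number of calls can be counted as follows. `llm_calls` is a hypothetical helper and the values in the usage lines are placeholders, since the abstract does not state the group size or run count.

```python
import math


def llm_calls(pool_size: int, group_size: int, top_k: int = 1, runs: int = 1) -> int:
    """Total LLM calls if each group costs one call and yields one survivor."""
    if group_size < 2 or top_k < 1:
        raise ValueError("group_size must be >= 2 and top_k must be >= 1")
    calls = 0
    remaining = pool_size
    while remaining > top_k:
        groups = math.ceil(remaining / group_size)
        calls += groups       # one call per group in this round
        remaining = groups    # one survivor per group moves on
    return calls * runs


# Usage with placeholder values: 1000 prefiltered candidates, groups of 10.
single_run = llm_calls(1000, 10)         # rounds of 100, 10, and 1 groups
five_runs = llm_calls(1000, 10, runs=5)  # voting multiplies the cost linearly
```

Under these assumptions the per-run cost is dominated by the first round, roughly the pool size divided by the group size, which is why shrinking the pool before the tournament matters more than the number of later rounds.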

pith-pipeline@v0.9.0 · 5510 in / 1285 out tokens · 39080 ms · 2026-05-16T13:11:16.602092+00:00 · methodology
