Putting Privacy to the Test: Introducing Red Teaming for Research Data Anonymization
Pith reviewed 2026-05-25 07:04 UTC · model grok-4.3
The pith
Researchers can test anonymization by pitting one team trying to re-identify data against another trying to block it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Simulating re-identification attacks by assigning opposing red and blue teams reveals weaknesses in anonymized research datasets that standard practices miss, leading to more actionable privacy improvements before publication.
What carries the argument
Red teaming versus blue teaming, in which one team attempts to re-identify the data and the other tries to prevent it, applied as a structured test of anonymization choices.
If this is right
- Researchers obtain concrete feedback on which variables or combinations remain risky after initial anonymization steps.
- Data release decisions can be documented with evidence from the simulated attacks rather than only policy checklists.
- The supplied materials allow the method to be repeated across different study designs and disciplines.
- Privacy protections can be iteratively strengthened during the red-blue sessions before final publication.
Where Pith is reading between the lines
- Departments could require a short red-teaming log as part of data-management plans for human-subjects research.
- The technique might extend to automated or semi-automated tools that flag high-risk records for manual review.
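As a rough illustration of what such a flagging tool might look like (it is not taken from the paper or its materials), the sketch below counts how many records share each combination of quasi-identifier values and flags rows whose combination is shared by fewer than k records, in the spirit of a k-anonymity check. The function name, column names, threshold, and toy records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def flag_risky_records(records, quasi_identifiers, k=2, max_combo_size=2):
    """Map each risky column combination to the row indices it exposes.

    A row is flagged for a combination when fewer than k rows share its
    values on those columns. All names here are illustrative assumptions.
    """
    flagged = {}
    for size in range(1, max_combo_size + 1):
        for combo in combinations(quasi_identifiers, size):
            # Project every row onto the current column combination.
            keys = [tuple(row[col] for col in combo) for row in records]
            counts = Counter(keys)
            risky = [i for i, key in enumerate(keys) if counts[key] < k]
            if risky:
                flagged[combo] = risky
    return flagged


# Toy, made-up records; not data from the study.
records = [
    {"age_band": "20-29", "role": "student", "city": "Bonn"},
    {"age_band": "20-29", "role": "student", "city": "Bonn"},
    {"age_band": "30-39", "role": "student", "city": "Bonn"},
    {"age_band": "30-39", "role": "professor", "city": "Bonn"},
]

result = flag_risky_records(records, ["age_band", "role", "city"])
for combo, rows in result.items():
    print(combo, "-> rows for manual review:", rows)
```

A check like this only covers structured columns; free-text answers and external knowledge that a human red team might exploit would still need manual review.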
Load-bearing premise
That running an internal red-versus-blue exercise will surface clearer and more useful anonymization fixes than the methods researchers already use.
What would settle it
A controlled comparison in which the same dataset is anonymized once with red teaming and once without, then tested for re-identification success by an independent party.
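One way the outcome of such a comparison might be tallied is sketched below, assuming the independent party submits one guessed identity per record and a ground-truth mapping is held back by the data owners. The record IDs, participant labels, and function name are hypothetical, and the numbers are toy values, not results from the paper.

```python
def reidentification_rate(guesses, truth):
    """Fraction of records whose guessed identity matches the ground truth.

    guesses: dict mapping record id -> guessed identity (missing = no guess)
    truth:   dict mapping record id -> true identity
    """
    if not truth:
        raise ValueError("ground truth must contain at least one record")
    correct = sum(
        1 for record_id, identity in truth.items()
        if guesses.get(record_id) == identity
    )
    return correct / len(truth)


# Hypothetical ground truth, known only to the data owners.
truth = {"r1": "P01", "r2": "P02", "r3": "P03", "r4": "P04"}

# Hypothetical attacker guesses against each released version.
guesses_baseline = {"r1": "P01", "r2": "P02", "r3": "P04"}  # anonymized without red teaming
guesses_red_teamed = {"r1": "P01", "r4": "P02"}             # anonymized with red teaming

baseline_rate = reidentification_rate(guesses_baseline, truth)
red_teamed_rate = reidentification_rate(guesses_red_teamed, truth)

print(f"baseline:   {baseline_rate:.2f}")
print(f"red-teamed: {red_teamed_rate:.2f}")
print(f"difference: {baseline_rate - red_teamed_rate:.2f}")
```

With only a handful of records a difference like this says little; a real comparison would need a larger dataset and an appropriate statistical test.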
Original abstract
Recently, the data protection practices of researchers in human-computer interaction and elsewhere have gained attention. Initial results suggest that researchers struggle with anonymization, partly due to a lack of clear, actionable guidance. In this work, we propose simulating re-identification attacks using the approach of red teaming versus blue teaming: a technique commonly employed in security testing, where one team tries to re-identify data, and the other team tries to prevent it. We discuss our experience applying this method to data collected in a mixed-methods study in human-centered privacy. We present usable materials for researchers to apply red teaming when anonymizing and publishing their studies' data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes applying red teaming versus blue teaming (simulating re-identification attacks, with one team attacking and the other defending) to research data anonymization as a way to generate clearer, more actionable guidance than current practices provide. It reports the authors' experience with this method on data from one mixed-methods study in human-centered privacy and supplies reusable materials for other researchers.
Significance. If the red-teaming process can be shown to produce measurable improvements, the work could help address documented difficulties researchers face with anonymization by importing an established adversarial testing technique from security. The provision of reusable materials is a concrete strength that supports potential adoption and reproducibility.
Major comments (2)
- [Experience section] The central claim that red teaming produces clearer, actionable anonymization improvements rests on the authors' experience with a single mixed-methods study (described in the section on applying the method). No quantitative measures of re-identification risk reduction, attack success rates, or controlled comparisons to standard anonymization practices are reported, so the superiority claim remains untested.
- [Materials and discussion sections] The manuscript states that the approach was applied to data from one study and that materials are provided, but does not include any evaluation of whether independent teams using the materials achieve better anonymization outcomes or lower re-identification risk than existing methods.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We respond point-by-point to the major comments below, clarifying the manuscript's scope as an introduction based on experience rather than a quantitative evaluation, and indicating revisions to address the noted gaps.
- Referee: [Experience section] The central claim that red teaming produces clearer, actionable anonymization improvements rests on the authors' experience with a single mixed-methods study (described in the section on applying the method). No quantitative measures of re-identification risk reduction, attack success rates, or controlled comparisons to standard anonymization practices are reported, so the superiority claim remains untested.
  Authors: The manuscript reports our experience applying the method to one study and does not include quantitative measures or controlled comparisons; it positions the work as proposing red teaming for this domain and supplying reusable materials rather than demonstrating empirical superiority. We will revise the experience and discussion sections to explicitly frame the benefits as derived from qualitative experience, remove any phrasing that could imply tested superiority, and add a limitations paragraph noting the absence of quantitative validation.
  Revision: yes
- Referee: [Materials and discussion sections] The manuscript states that the approach was applied to data from one study and that materials are provided, but does not include any evaluation of whether independent teams using the materials achieve better anonymization outcomes or lower re-identification risk than existing methods.
  Authors: We agree that the manuscript provides materials from our single application but contains no evaluation by independent teams. Such an evaluation would require a separate multi-team study outside the scope of this introductory paper. We will expand the discussion to acknowledge this limitation and identify independent evaluation of the materials as an important direction for future research.
  Revision: partial
Circularity Check
No circularity: proposal applies external red-teaming method to new domain without self-referential reduction
The paper introduces red teaming (an established security practice) for anonymization evaluation and reports experience from one mixed-methods study plus reusable materials. It contains no equations, fitted parameters, or derivations. No self-citations are invoked as load-bearing uniqueness theorems or used to smuggle in ansatzes. The central claim is a methodological proposal rather than a result derived from its own inputs by construction. Because no mathematical or definitional chain collapses to the paper's own outputs, the work can be assessed against external benchmarks on its own terms.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Researchers struggle with anonymization partly due to a lack of clear, actionable guidance.