arxiv: 2604.15641 · v1 · submitted 2026-04-17 · 💻 cs.CR

Half-Moon Cookie: Private, Similarity-Based Blocklisting with TOCTOU-Attack Resilience

Xinyuan Zhang, Anrin Chakraborti, Michael K. Reiter

Pith reviewed 2026-05-10 08:43 UTC · model grok-4.3

keywords: private blocklisting · similarity search · TOCTOU resilience · privacy-preserving protocols · malware detection · metric space queries · cryptographic blocklists

The pith

A client can test whether an item is similar to any entry on a secret blocklist without revealing the item or the list, with total cost equal to the sum of embedding and checking rather than their product, plus fast re-verification of prior

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Half-Moon Cookie, a framework that lets a client determine if its item lies close to any element on a server's proprietary blocklist in a metric space while keeping both the item and the blocklist hidden. The design performs embedding of the item separately from the actual membership test, so running time grows with the sum of those two costs instead of their product. It further supplies a lightweight way to confirm that an item already passed the check at an earlier time. This combination supports workflows in which one party performs the full check and another party only needs to verify the result before use, thereby closing the window for time-of-check-to-time-of-use attacks. The authors show how the approach can be realized for similarity-based detection of malicious executables.
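Stripped of all privacy machinery, the predicate being computed is small. The sketch below is a plaintext illustration only: it assumes items are already embedded as fixed-width bit strings and that the metric is Hamming distance, neither of which is confirmed by the summary above. The names `hamming` and `blocked` are hypothetical.

```python
from typing import Iterable


def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-width bit strings stored as ints."""
    return bin(a ^ b).count("1")


def blocked(embedded_item: int, embedded_blocklist: Iterable[int], threshold: int) -> bool:
    """True if the item lies within `threshold` of any blocklist entry.

    Assumption: both sides were embedded beforehand; nothing here is private.
    """
    return any(hamming(embedded_item, entry) <= threshold for entry in embedded_blocklist)


blocklist = [0b1111_0000, 0b0000_1111]
near = 0b1111_0001  # one bit away from the first entry
far = 0b1010_1010   # four bits away from both entries

print(blocked(near, blocklist, threshold=2))  # True
print(blocked(far, blocklist, threshold=2))   # False
```

The framework's contribution is to evaluate a predicate of this shape without the server seeing the item or the client seeing the blocklist; the plaintext version only fixes what "close to any blocklist element" means.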

Core claim

By decoupling the embedding computation from the subsequent blocklist membership test and adding an efficient proof that a prior check succeeded, a client and server can perform private similarity-based blocklisting whose cost is additive rather than multiplicative, while recipients can cheaply confirm the earlier result and thereby resist TOCTOU attacks.
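A back-of-the-envelope comparison shows why additive versus multiplicative scaling matters. The unit costs below are invented for illustration and are not measurements from the paper; `monolithic_cost` and `decoupled_cost` are hypothetical names for the two scaling laws the claim contrasts, not for any protocol step.

```python
def monolithic_cost(embed_cost: int, check_cost: int) -> int:
    """Scaling law the claim argues against: cost grows with the product."""
    return embed_cost * check_cost


def decoupled_cost(embed_cost: int, check_cost: int) -> int:
    """Scaling law claimed for the decoupled design: cost grows with the sum."""
    return embed_cost + check_cost


# Illustrative (made-up) unit costs for the embedding and the blocklist check.
for embed_cost, check_cost in [(10, 100), (100, 1_000), (1_000, 10_000)]:
    product = monolithic_cost(embed_cost, check_cost)
    total = decoupled_cost(embed_cost, check_cost)
    print(f"embed={embed_cost:>5} check={check_cost:>6} "
          f"product={product:>9} sum={total:>6} ratio={product / total:.1f}")
```

The gap widens as either component grows, which is why the claim is load-bearing: it only holds if the decoupled steps really can be run once each rather than nested.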

What carries the argument

The separation of embedding from the blocklist check together with an efficient confirmation primitive that an item previously passed the check.

If this is right

Performance of private similarity checks scales linearly with embedding cost plus check cost instead of their product.
One party can perform the full private check on an item and another party can later confirm the result with low overhead before using the item.
The same construction directly yields a privacy-preserving method for similarity-based malware detection that hides both client inputs and the blocklist itself.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same separation pattern could be applied to other similarity-filtering tasks such as private content moderation or fraud screening where one party vets and another consumes.
If the embedding is itself learned from data, the framework might be combined with existing machine-learning pipelines for blocklist construction without extra privacy leakage.
The fast confirmation step could be used in distributed systems to move expensive checks off the critical path while still guaranteeing freshness against TOCTOU.

Load-bearing premise

A suitable embedding function exists that preserves the needed similarity relation while still permitting an efficient private distance check whose security remains intact when embedding and checking are performed separately.
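To make the premise concrete, here is one well-known style of similarity-preserving embedding: a MinHash signature over byte shingles, compared position by position. This is a stand-in chosen for illustration and is not claimed to be the embedding the paper uses; `shingles`, `minhash`, and `signature_distance` are hypothetical helper names, and the sample byte strings are made up.

```python
import hashlib


def shingles(data: bytes, k: int = 4) -> set:
    """All length-k byte substrings (assumes len(data) >= k)."""
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def minhash(data: bytes, num_hashes: int = 64) -> list:
    """MinHash signature: for each seeded hash, keep the smallest digest over all shingles."""
    grams = shingles(data)
    signature = []
    for seed in range(num_hashes):
        prefix = seed.to_bytes(4, "big")  # a distinct prefix gives a distinct hash function
        signature.append(min(
            hashlib.blake2b(prefix + g, digest_size=8).digest() for g in grams
        ))
    return signature


def signature_distance(a: list, b: list) -> int:
    """Number of positions where two signatures differ (Hamming distance on signatures)."""
    return sum(x != y for x, y in zip(a, b))


original = b"MZ\x90\x00" + b"payload-section-" * 8
variant = original.replace(b"section", b"sectiom", 1)  # a single byte changed
unrelated = bytes(range(len(original)))                # shares no shingles with original

near = signature_distance(minhash(original), minhash(variant))
far = signature_distance(minhash(original), minhash(unrelated))
print(near, far)
```

Inputs with mostly shared shingles agree on most signature positions, so `near` should come out well below `far`. Whether an embedding of this kind also admits an efficient private distance check, with security intact once embedding is split from checking, is exactly what the premise asks.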

What would settle it

An embedding that preserves distances for the target application yet forces either a multiplicative performance penalty or a security loss once the embedding step is moved outside the check.

Figures

Figures reproduced from arXiv: 2604.15641 by Anrin Chakraborti, Michael K. Reiter, Xinyuan Zhang.

**Figure 1.** Ideal Functionalities used in Half Moon.

… Embed-and-Map $F_{EM}$, that embeds $C_{snd}$’s input into the metric space F θ, and ii) Test-and-Commit $F_{TC}$, that performs the predicate check $\mathrm{blocked}_{L,T}(\cdot)$ on the embedded input against the embedded blocklist $L$ provided by $S$. Lastly, during implicit check, $C_{rcv}$ and $S$ participate in iii) Implicit check $F_{IC}$, that allows fast verification on a previously checked input. We pre…

**Figure 2.** Framework interaction.

… narrow timeframe, substantially lowering the probability of successful exploitation. Second, there exists an unavoidable delay for newly emerged threats to be reflected in $S$’s state, creating a potential zero-day exposure. We argue that this residual delay between server-side updates and client-side checks is inherent to any distributed system and does not represent a weakness specif…

**Figure 3.** Half Moon framework.

… verified by a previous explicit check. We will show next that this happens with probability ≤ |F|/|KP|. Suppose that the adversary provides inputs $w, \vec{m}$ to $F_{EM}$, and $\vec{p}'_1, \vec{p}'_2$ to $F_{TC}$. Let $\vec{\gamma}$ be the allowlist token stored at $S$ after the explicit check. Now consider that the adversary provides $w', \vec{m}'$ to $C_{rcv}$. It must be that

$$\vec{\gamma} = f_{k_f}(w') + \vec{s} \times H\big(w', f_{k_f}(w'), k_f, \vec{m}'\big) \qquad (5)$$

…

**Figure 4.** High-level description of Half Moon instantiation ($h^{*} \leftarrow H(w, f_{k_f}(w), k_f, \vec{m})$).

7. Evaluation. We implemented the Half Moon-based malware detection in C++11. We benchmarked these implementations on an Ubuntu 22.04 machine with 12 cores and 32 GB of RAM as the server, and multiple machines that are each equipped with 4 cores and 16 GB of RAM as the clients. We used the emp-toolkit [9] for OT and garbled ci…

**Figure 5.** Email attachment size distribution.

**Figure 6.** Explicit check throughput. The red line uses the same parameter sizes as the black line but with the explicit check functionality implemented in a monolithic garbled circuit.

… clients. In Fig. 6a, we show results of Half Moon when varying the size of the blocklist, $|L|$. We observe that the overall throughput degrades more quickly as the number of entries increases, because a larger blocklist directly incr…

**Figure 7.** Truth Table for XOR Gates.

We start with bit flipping. In CRGC, bit flipping transforms a garbled circuit to be reusable, which is the foundation of Lemma 2. Bit flipping refers to applying a one-time pad over $a$ and all wires in the circuit $C$ to obtain $C'$ and $a'$. Evaluator input $b$ and final output wires do not get flipped. A flipped wire needs to be modified so that the truth table of its child gates mai…

**Figure 8.** $\Pi_{EM}$: Embed-and-Map protocol.

**Figure 9.** $\Pi_{TC}$: Test-and-Commit protocol.

**Figure 10.** $\Pi_{IC}$: Implicit check protocol.

Original abstract

Blocklisting is a common technique for preventing the use of known malicious content. However, conventional blocklisting infrastructures require either the blocklist to be public or clients to reveal their queries to the blocklist server. In this work, we introduce a private blocklisting framework, Half-Moon Cookie, by which a client can check an item against a proprietary blocklist held by a server, to determine whether the item is close to any blocklist element in a metric space. Critically, our design separates the embedding step from the blocklist check, so that performance degrades with their sum and not their product. Still, this check might be too costly to perform on the critical path of using the item, and so our design also supports a very efficient check that an item previously passed the blocklist check. In doing so, we support applications where one client can perform the blocklist check on the item before sending it, and recipients can more efficiently confirm the previous result before using the item, thereby avoiding TOCTOU attacks. We demonstrate how Half-Moon Cookie can be instantiated for similarity-based malware detection, enabling effective identification of malicious executables without revealing client inputs or disclosing the underlying blocklist.
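The sender-checks, recipient-confirms workflow described in the abstract can be sketched without any of the cryptographic privacy. In the toy below a keyed HMAC tag stands in for the paper's allowlist token, an exact-match lookup stands in for the similarity test, and the server sees every item in the clear, all of which the real protocol is designed to avoid. `ToyServer` and its methods are hypothetical names.

```python
import hashlib
import hmac
import secrets


class ToyServer:
    """Non-private stand-in for the blocklist server; it sees items in the clear."""

    def __init__(self, blocklist):
        self._blocklist = set(blocklist)
        self._key = secrets.token_bytes(32)
        self._allowlist = set()  # tags of items that passed an explicit check

    def _tag(self, item: bytes) -> bytes:
        # Keyed tag bound to the exact bytes of the item.
        return hmac.new(self._key, item, hashlib.sha256).digest()

    def explicit_check(self, item: bytes) -> bool:
        """Expensive path: run the blocklist test and remember a tag on success."""
        if item in self._blocklist:  # placeholder for the similarity test
            return False
        self._allowlist.add(self._tag(item))
        return True

    def implicit_check(self, item: bytes) -> bool:
        """Cheap path: confirm that this exact item passed an explicit check earlier."""
        return self._tag(item) in self._allowlist


server = ToyServer(blocklist=[b"known-malware"])
attachment = b"quarterly-report"

print(server.explicit_check(attachment))                    # sender, before sending: True
print(server.implicit_check(attachment))                    # recipient, before use: True
print(server.implicit_check(b"quarterly-report-tampered"))  # swapped after the check: False
```

Because the cheap confirmation is bound to the item's bytes, an item substituted between the sender's check and the recipient's use fails the confirmation, which is the TOCTOU window the design aims to close.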

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper sketches a private similarity blocklisting scheme that decouples embedding from the check to get additive rather than multiplicative costs and adds a fast re-verification path for TOCTOU safety, but the abstract gives almost no protocol, proof, or measurement details to assess whether it works.

The main thing to know is that Half-Moon Cookie separates the embedding step from the private blocklist check so performance scales with their sum instead of their product, and it includes an efficient way to re-confirm that an item already passed the check. This is meant to support use cases like one party checking a file before sending it and the recipient doing a quick verification to avoid TOCTOU attacks in malware detection or similar settings without revealing queries or the blocklist itself.

Referee Report

3 major / 2 minor

Summary. The paper introduces Half-Moon Cookie, a private blocklisting framework allowing a client to check whether an item is close (in a metric space) to any element of a server's proprietary blocklist without revealing the item or the list. The design decouples the embedding step from the blocklist check so that performance cost is additive rather than multiplicative, and adds an efficient re-check primitive for items that previously passed the blocklist test. This re-check is intended to support TOCTOU-resilient workflows in which one party performs the full check and another performs only the lightweight confirmation. The framework is instantiated for similarity-based malware detection.

Significance. If the construction is sound, the separation of embedding from checking and the efficient re-check primitive would be useful contributions to privacy-preserving security infrastructure. The approach could enable practical private blocklisting in settings such as malware detection where both client inputs and the blocklist itself must remain confidential. The emphasis on additive rather than multiplicative cost and on TOCTOU resilience directly addresses deployment constraints that existing private-set or private-similarity schemes often leave unaddressed.

major comments (3)

[Abstract and §1] The central performance claim, that cost scales with the sum rather than the product of embedding and check, is load-bearing for the contribution, yet no concrete protocol, complexity analysis, or security reduction is supplied to show that decoupling preserves both correctness and privacy.
[§3, Design] The security of the private distance check after decoupling rests on the existence of an embedding that preserves the required similarity relation while permitting an efficient, private distance test; no formal definition of the embedding properties, security model, or reduction to standard assumptions is given.
[Evaluation] No performance measurements, comparison to baselines, or concrete security analysis against TOCTOU or embedding-leakage attacks are reported, leaving the practical claims unverified.

minor comments (2)

The connection between the name 'Half-Moon Cookie' and the technical construction is not explained.
[§2] Notation for the metric space and distance function should be introduced consistently before the protocol description.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their positive summary and for identifying the areas where the manuscript requires additional rigor. We address each major comment below and will perform a major revision incorporating formal protocol details, security definitions, and evaluation results.

Referee: [Abstract and §1] The central performance claim, that cost scales with the sum rather than the product of embedding and check, is load-bearing for the contribution, yet no concrete protocol, complexity analysis, or security reduction is supplied to show that decoupling preserves both correctness and privacy.

Authors: We agree the decoupling claim is central and currently lacks supporting detail. The manuscript presents the framework conceptually. In revision we will supply a concrete protocol, asymptotic complexity analysis establishing additive rather than multiplicative cost, and a security reduction showing that the separation preserves correctness and privacy under the stated assumptions.
Revision: yes

Referee: [§3, Design] The security of the private distance check after decoupling rests on the existence of an embedding that preserves the required similarity relation while permitting an efficient, private distance test; no formal definition of the embedding properties, security model, or reduction to standard assumptions is given.

Authors: The current §3 relies on the existence of a suitable embedding without formalizing its properties. We will revise the section to define the required embedding properties (similarity preservation and compatibility with private distance testing), state the security model explicitly, and provide a reduction to standard cryptographic assumptions.
Revision: yes

Referee: [Evaluation] No performance measurements, comparison to baselines, or concrete security analysis against TOCTOU or embedding-leakage attacks are reported, leaving the practical claims unverified.

Authors: The present manuscript is design-focused and contains no empirical results. We will add a dedicated evaluation section that reports performance measurements for the malware-detection instantiation, comparisons against relevant baselines, and concrete analysis of TOCTOU resilience together with potential embedding-leakage attacks.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper's abstract and high-level description introduce a private blocklisting design that decouples embedding from the check (yielding additive rather than multiplicative costs) and adds an efficient prior-result re-check for TOCTOU resilience. No equations, fitted parameters, self-citations, or derivation steps appear that reduce any claimed property to a quantity defined by the authors' own inputs or prior results. The central premise rests on the external existence of a suitable embedding function that preserves similarity while enabling private distance checks; this is stated as an assumption rather than derived internally. The provided text therefore contains no load-bearing steps that collapse by construction, self-definition, or self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review limited to abstract; no concrete free parameters, ad-hoc axioms, or invented entities are stated. The framework implicitly rests on standard cryptographic assumptions for private similarity search.

axioms (1)

[standard math] Existence of a secure embedding function and private distance protocol under standard cryptographic assumptions.
The design requires these primitives to achieve privacy and the claimed performance separation.

pith-pipeline@v0.9.0 · 5517 in / 1205 out tokens · 31631 ms · 2026-05-10T08:43:31.894763+00:00 · methodology

Reference graph

Works this paper leans on

112 extracted references · 2 canonical work pages

[1] “Customer advisory: An attack, leveraging spam email, Teams, SharePoint, and OneDrive, potentially linked to the ransomware group Sangria Tempest,” https://www.deepwatch.com/labs/spam-email-attach-teams-sharepoint-and-onedrive-potentially-linked-to-the-ransomware-group-sangria-tempest/, 2024
[2] “EasyList,” https://easylist.to, 2016
[3] “Image hash list,” https://www.iwf.org.uk/our-technology/our-services/image-hash-list/
[4] “NCMEC, Google and image hashing technology,” https://safety.google/stories/hash-matching-to-help-ncmec/
[5] “Don’t route or peer lists (DROP),” https://www.spamhaus.org/blocklists/do-not-route-or-peer/
[6] “Don’t panic. Protect your network,” https://www.team-cymru.com/bogon-networks
[7] “Black and white cookie,” https://en.wikipedia.org/wiki/Black_and_white_cookie, 8 Sep. 2025
[8] “CWE-367: Time-of-check time-of-use (TOCTOU) race condition,” https://cwe.mitre.org/data/definitions/367.html, 2006
[9] “emp-toolkit,” https://github.com/emp-toolkit
[10] “NTL: A library for doing number theory,” https://libntl.org
[11] “Internet x.509 public key infrastructure certificate and certificate revocation list (CRL) profile,” https://datatracker.ietf.org/doc/html/rfc5280, 2008
[12] “Simple mail transfer protocol,” https://datatracker.ietf.org/doc/html/rfc5321, 2008
[13] “Message submission for mail,” https://datatracker.ietf.org/doc/html/rfc6409, 2011
[14] N. Agrawal, J. Bell, A. Gascón, and M. J. Kusner, “MPC-friendly commitments for publicly verifiable covert security,” in ACM Conference on Computer and Communications Security, 2021
[15] S. Agrawal, “Stronger security for reusable garbled circuits, general definitions and attacks,” in Advances in Cryptology – CRYPTO, 2017
[16] S. Angel, E. Ioannidis, E. Margolin, S. Setty, and J. Woods, “Reef: fast succinct non-interactive zero-knowledge regex proofs,” in USENIX Security Symposium, 2024
[17] C. Barnum, D. Heath, V. Kolesnikov, and R. Ostrovsky, “Adaptive garbled circuits and garbled RAM from non-programmable random oracles,” https://eprint.iacr.org/2023/1527, 2023
[18] C. Baum, “On garbling schemes with and without privacy,” in International Conference on Security and Cryptography for Networks, 2016
[19] M. Bellare and P. Rogaway, “Random oracles are practical: A paradigm for designing efficient protocols,” in ACM Conference on Computer and Communications Security, 1993
[20] M. Bellare and P. Rogaway, “Introduction to modern cryptography,” https://web.cs.ucdavis.edu/~rogaway/classes/227/spring05/book/main.pdf, 2004
[21] J. Belotti, “More than DNS: The 14 hour AWS us-east-1 outage,” https://thundergolfer.com/blog/aws-us-east-1-outage-oct20, 26 Oct. 2025
[22] M. Blanton and F. Bayatbabolghani, “Efficient server-aided secure two-party function evaluation with applications to genomic computation,” Proceedings on Privacy Enhancing Technologies, 2016
[23] E. Blass and G. Noubir, “Assumption-free fuzzy PSI via predicate encryption,” https://eprint.iacr.org/2025/217, 2025
[24] Broadcom, “About the SEP Intensive Protection file reputation service,” https://techdocs.broadcom.com/us/en/symantec-security-software/information-security/data-loss-prevention/16-1/about-discovering-and-preventing-data-loss-on-endpoints/about-the-sep-intensive-protection-file-reputation-service.html, 4 Sep. 2025
[25] A. Z. Broder, “On the resemblance and containment of documents,” in Compression and Complexity of SEQUENCES, 1997
[26] W. J. Buchanan and H. Ali, “Partial and fully homomorphic matching of IP addresses against blacklists for threat analysis,” arXiv, Tech. Rep., 2025
[27] J. Camenisch and G. M. Zaverucha, “Private intersection of certified sets,” in International Conference on Financial Cryptography and Data Security, 2009
[28] S. K. Cha, I. Moraru, J. Jang, J. Truelove, D. Brumley, and D. G. Andersen, “SplitScreen: Enabling efficient, distributed malware detection,” in 7th USENIX Symposium on Networked System Design and Implementation, Apr. 2010
[29] A. Chakraborti, G. Fanti, and M. K. Reiter, “Distance-aware private set intersection,” in USENIX Security Symposium, 2023
[30] N. Chandran, D. Gupta, and A. Shah, “Circuit-PSI with linear complexity via relaxed batch OPPRF,” in Proceedings on Privacy Enhancing Technologies, 2022
[31] I. Chang, K. Sotiraki, W. Chen, M. Kantarcioglu, and R. A. Popa, “HOLMES: A platform for detecting malicious inputs in secure collaborative computation,” in USENIX Security Symposium, 2023
[32] M. S. Charikar, “Similarity estimation techniques from rounding algorithms,” in ACM Symposium on Theory of Computing, May 2002
[33] W. Chongchitmate, S. Lu, and R. Ostrovsky, “Approximate PSI with near-linear communication,” 2024. [Online]. Available: https://eprint.iacr.org/2024/682
[34] Cisco, “File reputation filtering and file analysis,” https://www.cisco.com/c/en/us/td/docs/security/ces/ces 15-5-3/user guide/b ESA Admin Guide ces 15-5-3/b ESA Admin Guide 12 1 chapter 010001.pdf
[35] G. Cormode and S. Muthukrishnan, “The string edit distance matching problem with moves,” in ACM Symposium on Discrete Algorithms, 2002
[36] H. Corrigan-Gibbs and D. Boneh, “Prio: private, robust, and scalable computation of aggregate statistics,” in USENIX Symposium on Networked System Design and Implementation, 2017
[37] “Child sexual abuse material (CSAM) identification and reporting for U.S. based companies,” https://technologycoalition.org/wp-content/uploads/CSAM-Identification Reporting R3-1.pdf, 2022
[38] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, “Locality-sensitive hashing scheme based on p-stable distributions,” in ACM Symposium on Computational Geometry, 2004
[39] “DNS blacklists and whitelists,” https://datatracker.ietf.org/doc/html/rfc5782, 2010
[40] J. Fan, C. Guan, K. Ren, Y. Cui, and C. Qiao, “SPABox: Safeguarding privacy during deep packet inspection at a middlebox,” IEEE/ACM Transactions on Networking, 2017
[41] M. Fischlin, A. Lehmann, T. Ristenpart, T. Shrimpton, M. Stam, and S. Tessaro, “Random oracles with(out) programmability,” in Advances in Cryptology – ASIACRYPT, 2010
[42] S. Garg, M. Hajiabadi, M. Mahmoody, and A. Mohammed, “Limits on the power of garbling techniques for public-key encryption,” in Advances in Cryptology – CRYPTO, 2018
[43] G. Garimella, B. Pinkas, M. Rosulek, N. Trieu, and A. Yanai, “Oblivious key-value stores and amplification for private set intersection,” in Advances in Cryptology – CRYPTO, 2021, pp. 395–425
[44] G. Garimella, M. Rosulek, and J. Singh, “Structure-aware private set intersection, with applications to fuzzy matching,” in Advances in Cryptology – CRYPTO, 2022
[45] G. Garimella, B. Goff, and P. Miao, “Computation efficient structure-aware PSI from incremental function secret sharing,” in Advances in Cryptology – CRYPTO, 2024
[46] S. Ghosh and T. Nilges, “An algebraic approach to maliciously secure private set intersection,” in Advances in Cryptology – EUROCRYPT, 2019
[47] S. Ghosh and M. Simkin, “The communication complexity of threshold private set intersection,” in Advances in Cryptology – CRYPTO, 2019
[48] S. Ghosh, J. B. Nielsen, and T. Nilges, “Maliciously secure oblivious linear function evaluation with constant overhead,” in Advances in Cryptology – ASIACRYPT, 2017
[49] S. Goldwasser, Y. Kalai, R. A. Popa, V. Vaikuntanathan, and N. Zeldovich, “Reusable garbled circuits and succinct functional encryption,” in ACM Symposium on Theory of Computing, 2013, pp. 555–564
[50] “Real-time, privacy-preserving url protection,” https://security.googleblog.com/2024/03/blog-post.html, 2024
[51] C. Harth-Kitzerow, G. Carle, F. Fei, A. Luckow, and J. Klepsch, “CRGC – a practical framework for constructing reusable garbled circuits,” in International Conference on Security and Cryptography, 2022
[52] R. J. Joyce, G. Miller, P. Roth, R. Zak, E. Zaresky-Williams, H. Anderson, E. Raff, and J. Holt, “Ember2024 - a benchmark dataset for holistic evaluation of malware classifiers,” in ACM SIGKDD, 2025
[53] J. Katz, A. J. Malozemoff, and X. Wang, “Efficiently enforcing input validity in secure two-party computation,” 2016. [Online]. Available: https://eprint.iacr.org/2016/184
[54] I. Kim, W. Susilo, J. Baek, J. Kim, and Y. W. Chow, “Pcsf: Privacy-preserving content-based spam filter,” 2023
[55] L. Kissner and D. Song, “Privacy-preserving set operations,” in Advances in Cryptology – CRYPTO, 2005
[56] B. Klimt and Y. Yang, “The enron corpus: A new dataset for email classification research,” in European Conference on Machine Learning, 2004
[57] D. Kogan and H. Corrigan-Gibbs, “Private blocklist lookups with checklist,” in USENIX Security Symposium, 2021
[58] J. Kornblum, “Identifying almost identical files using context triggered piecewise hashing,” 2006
[59] A. Kulshrestha and J. R. Mayer, “Identifying harmful media in end-to-end encrypted communication: Efficient private membership computation,” in USENIX Security Symposium, 2021
[60] C. Lan, J. Sherry, R. Ada Popa, S. Ratnasamy, and Z. Liu, “Embark: Security outsourcing middleboxes to the cloud,” in 13th USENIX Symposium on Networked System Design and Implementation, Mar. 2016
[61] D. Lee, M. Ahn, H. Kwak, J. B. Hong, and H. Kim, “BlindFilter: Privacy-preserving spam email detection using homomorphic encryption,” 2023
[62] L. Li, B. Pal, J. Ali, N. Sullivan, R. Chatterjee, and T. Ristenpart, “Protocols for checking compromised credentials,” in ACM Conference on Computer and Communications Security, 2019
[63] Y. Li, S. C. Sundaramurthy, A. G. Bardas, X. Ou, D. Caragea, X. Hu, and J. Jang, “Experimental study of fuzzy hashing in malware clustering analysis,” in USENIX Conference on Cyber Security Experimentation and Test, 2015
[64] N. Luo, C. Weng, J. Singh, G. Tan, M. Raykova, and R. Piskac, “Privacy-preserving regular expression matching using tnfa,” 2024
[65] “MalwareBazaar,” https://bazaar.abuse.ch
[66] L. Melis, H. J. Asghar, E. De Cristofaro, and M. A. Kaafar, “Private processing of outsourced network functions: Feasibility and constructions,” in ACM International Workshop on Security in Software Defined Networks and Network Function Virtualization, Mar. 2016
[67] Microsoft Defender, “Turn on cloud protection in Microsoft Defender Antivirus,” https://learn.microsoft.com/en-us/defender-endpoint/enable-cloud-protection-microsoft-defender-antivirus, 20 Oct. 2025
[68] Microsoft Threat Intelligence, “File hosting services misused for identity phishing,” https://www.microsoft.com/en-us/security/blog/2024/10/08/file-hosting-services-misused-for-identity-phishing, 2024
[69] J. Oberheide, E. Cooke, and F. Jahanian, “CloudAV: N-version antivirus in the network cloud,” in 17th USENIX Security Symposium, Jul. 2008
[70] J. Oliver, C. Cheng, and Y. Chen, “Tlsh – a locality sensitive hash,” in Cybercrime and Trustworthy Computing Workshop, 2013
[71] B. Pal, M. Islam, M. S. Bohuk, N. Sullivan, L. Valenta, T. Whalen, C. Wood, T. Ristenpart, and R. Chatterjee, “Might i get pwned: A second generation compromised credential checking service,” in USENIX Security Symposium, 2022
[72] M. A. Pathak, M. Sharifi, and B. Raj, “Privacy preserving spam filtering,” arXiv:1102.4021, 2011
[73] “PhotoDNA,” https://www.microsoft.com/en-us/photodna
[74] “Proofpoint et intelligence,” https://www.proofpoint.com/sites/default/files/proofpoint et intelligence ds r5.pdf
[75] P. Rindal and P. Schoppmann, “Vole-psi: Fast oprf and circuit-psi from vector-ole,” in Advances in Cryptology – EUROCRYPT, 2021
[76] V. Roussev, “Data fingerprinting with similarity digests,” in Advances in Digital Forensics, 2010
[77] M. Scannapieco, I. Figotin, E. Bertino, and A. K. Elmagarmid, “Privacy preserving schema and data matching,” in ACM SIGMOD, 2007
[78] “Network reporting,” https://www.shadowserver.org/what-we-do/network-reporting/, 2006
[79] J. Sherry, C. Lan, R. Ada Popa, and S. Ratnasamy, “BlindBox: Deep packet inspection over encrypted traffic,” in ACM SIGCOMM, Aug. 2015
[80] “Spamhaus blocklist (SBL),” https://www.spamhaus.org/blocklists/spamhaus-blocklist/

Showing first 80 references.