
arxiv: 1901.04358 · v1 · pith:5YRZMLTOnew · submitted 2019-01-14 · 💻 cs.DS

Quotient Hash Tables - Efficiently Detecting Duplicates in Streaming Data

classification 💻 cs.DS
keywords: data, quotient, analysis, duplicate, duplicates, filters, hash, streaming

This article presents the Quotient Hash Table (QHT), a new data structure for duplicate detection in unbounded streams. QHTs stem from a corrected analysis of streaming quotient filters (SQFs), resulting in a 33% reduction in memory usage for equal performance. We provide a new and thorough analysis of both algorithms, with results of interest to other existing constructions. We also introduce an optimised version of our new data structure, dubbed Queued QHT with Duplicates (QQHTD). Finally, we discuss the effect of adversarial inputs on hash-based duplicate filters similar to QHT.
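For readers unfamiliar with hash-based duplicate filters, the following is a minimal sketch of the general idea: each stream element is hashed to a bucket and a short fingerprint, and an element is reported as a duplicate when its fingerprint is already stored in its bucket. It is an illustration only and is not taken from the paper. The class name `FingerprintDuplicateFilter`, its parameters (`n_buckets`, `cells_per_bucket`, `fingerprint_bits`), the use of SHA-256, and the random eviction policy are all assumptions made here, and they may differ from the actual QHT, SQF, and QQHTD constructions.

```python
import hashlib
import random


class FingerprintDuplicateFilter:
    """Illustrative bucketed fingerprint filter.

    Hypothetical sketch: NOT the paper's QHT, SQF or QQHTD definition.
    """

    def __init__(self, n_buckets=1024, cells_per_bucket=1,
                 fingerprint_bits=16, seed=0):
        self.n_buckets = n_buckets
        self.cells_per_bucket = cells_per_bucket
        self.fingerprint_bits = fingerprint_bits
        # Each bucket holds at most `cells_per_bucket` fingerprints,
        # so memory stays bounded however long the stream is.
        self.buckets = [[] for _ in range(n_buckets)]
        self.rng = random.Random(seed)

    def _locate(self, item):
        # Assumption: SHA-256 stands in for whatever hash a real filter uses.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        bucket = value % self.n_buckets
        fingerprint = (value // self.n_buckets) % (1 << self.fingerprint_bits)
        return bucket, fingerprint

    def seen_before(self, item):
        """Return True if `item` looks like a duplicate, then record it."""
        bucket, fingerprint = self._locate(item)
        cells = self.buckets[bucket]
        if fingerprint in cells:
            return True
        if len(cells) < self.cells_per_bucket:
            cells.append(fingerprint)
        else:
            # Assumed eviction policy: overwrite a random cell of a full bucket.
            cells[self.rng.randrange(self.cells_per_bucket)] = fingerprint
        return False
```

Because memory is bounded, a filter of this kind can err in both directions: two distinct elements may share a bucket and fingerprint (a false positive), and an evicted fingerprint makes a later repeat look new (a false negative).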
