An Effective Clustering Approach to Web Query Log Anonymization

Amin Milani Fard; Ke Wang

arxiv: 1012.0663 · v1 · pith:EIORNB6Nnew · submitted 2010-12-03 · 💻 cs.DB · cs.CR

keywords: data, query, anonymization, information, transaction, clustering, privacy, results

Web query log data contain information useful to research; however, releasing such data can allow the search engine users who issued the queries to be re-identified. These privacy concerns go far beyond removing explicitly identifying information such as name and address, since non-identifying personal data can be combined with publicly available information to pinpoint an individual. In this work we model web query logs as unstructured transaction data and present a novel transaction anonymization technique, based on clustering and generalization, that achieves k-anonymity. We conduct extensive experiments on the AOL query log data. Our results show that this method yields higher data utility than state-of-the-art transaction anonymization methods.
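
To make the privacy goal concrete, the following minimal Python sketch checks k-anonymity on transaction data, assuming the usual reading that every transaction must be identical to at least k - 1 others after generalization. It is not the authors' clustering algorithm: the `TAXONOMY` mapping, the sample `log`, and the function names `generalize` and `is_k_anonymous` are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical taxonomy mapping each query term to a more general parent.
# The abstract does not specify the generalization hierarchy the paper uses.
TAXONOMY = {
    "apple": "fruit",
    "banana": "fruit",
    "aspirin": "medicine",
    "ibuprofen": "medicine",
}


def generalize(transaction):
    """Replace each term by its parent in TAXONOMY; unknown terms stay as-is."""
    return frozenset(TAXONOMY.get(term, term) for term in transaction)


def is_k_anonymous(transactions, k):
    """Return True if every transaction is identical to at least k - 1 others."""
    counts = Counter(frozenset(t) for t in transactions)
    return all(count >= k for count in counts.values())


# Toy query log: one set of query terms per user (illustrative data only).
log = [
    {"apple", "aspirin"},
    {"banana", "ibuprofen"},
    {"apple", "ibuprofen"},
]

generalized = [generalize(t) for t in log]

print(is_k_anonymous(log, 2))          # raw log: every transaction is unique
print(is_k_anonymous(generalized, 3))  # after generalization: all three coincide
```

Generalizing every term one level up is the bluntest possible choice; it reaches k-anonymity here at the cost of detail. The abstract's claim of higher data utility concerns keeping that loss small, which this sketch does not attempt to measure.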
