pith. sign in

arxiv: 1207.3269 · v2 · pith:TUZ25BEJnew · submitted 2012-07-13 · 💻 cs.LG · cs.IT· math.IT

The Price of Privacy in Untrusted Recommendation Engines

classification 💻 cs.LG cs.ITmath.IT
keywords learningprivacysample-complexityuserdevelopinformationregimeapproach
0
0 comments X
read the original abstract

Recent increase in online privacy concerns prompts the following question: can a recommender system be accurate if users do not entrust it with their private data? To answer this, we study the problem of learning item-clusters under local differential privacy, a powerful, formal notion of data privacy. We develop bounds on the sample-complexity of learning item-clusters from privatized user inputs. Significantly, our results identify a sample-complexity separation between learning in an information-rich and an information-scarce regime, thereby highlighting the interaction between privacy and the amount of information (ratings) available to each user. In the information-rich regime, where each user rates at least a constant fraction of items, a spectral clustering approach is shown to achieve a sample-complexity lower bound derived from a simple information-theoretic argument based on Fano's inequality. However, the information-scarce regime, where each user rates only a vanishing fraction of items, is found to require a fundamentally different approach both for lower bounds and algorithms. To this end, we develop new techniques for bounding mutual information under a notion of channel-mismatch, and also propose a new algorithm, MaxSense, and show that it achieves optimal sample-complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to $(i)$ learning based on 1-bit sketches, and $(ii)$ adaptive learning, where queries can be adapted based on answers to past queries.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.