pith. sign in

arxiv: 1701.07675 · v2 · pith:4XLFO6CWnew · submitted 2017-01-26 · 💻 cs.IT · cs.CV· cs.IR· math.IT

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

classification 💻 cs.IT cs.CVcs.IRmath.IT
keywords searchcodescodingproblemvectorsencodingfeaturegain
0
0 comments X
read the original abstract

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding, which maps the original feature vectors to a less entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors is sub-optimal. Instead, a properly designed ternary encoding achieves higher coding gains and lower complexity.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.