Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

Dimche Kostadinov; Slava Voloshynovskiy; Sohrab Ferdowsi; Taras Holotyak

arxiv: 1701.07675 · v2 · pith:4XLFO6CWnew · submitted 2017-01-26 · 💻 cs.IT · cs.CV· cs.IR· math.IT

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

Sohrab Ferdowsi , Slava Voloshynovskiy , Dimche Kostadinov , Taras Holotyak This is my paper

classification 💻 cs.IT cs.CVcs.IRmath.IT

keywords searchcodescodingproblemvectorsencodingfeaturegain

0 comments

read the original abstract

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding, which maps the original feature vectors to a less entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors is sub-optimal. Instead, a properly designed ternary encoding achieves higher coding gains and lower complexity.

This paper has not been read by Pith yet.

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

discussion (0)