On Large-Scale Retrieval: Binary or n-ary Coding?
read the original abstract
The growing amount of data available in modern-day datasets makes the need to efficiently search and retrieve information. To make large-scale search feasible, Distance Estimation and Subset Indexing are the main approaches. Although binary coding has been popular for implementing both techniques, n-ary coding (known as Product Quantization) is also very effective for Distance Estimation. However, their relative performance has not been studied for Subset Indexing. We investigate whether binary or n-ary coding works better under different retrieval strategies. This leads to the design of a new n-ary coding method, "Linear Subspace Quantization (LSQ)" which, unlike other n-ary encoders, can be used as a similarity-preserving embedding. Experiments on image retrieval show that when Distance Estimation is used, n-ary LSQ outperforms other methods. However, when Subset Indexing is applied, interestingly, binary codings are more effective and binary LSQ achieves the best accuracy.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.