On the Dimensionality of Embeddings for Sparse Features and Data

Maxim Naumov

arxiv: 1901.02103 · v1 · pith:K5SRROVFnew · submitted 2019-01-07 · 💻 cs.LG · cs.CV · cs.IT · math.IT · stat.ML

classification 💻 cs.LG · cs.CV · cs.IT · math.IT · stat.ML

keywords dimensionality · embeddings · sparse · data · embedding · features · item · reduce

In this note we discuss a common misconception, namely that embeddings are always used to reduce the dimensionality of the item space. We show that when dimensionality is measured in terms of information entropy, the embedding of sparse probability distributions, which can be used to represent sparse features or data, may or may not reduce the dimensionality of the item space. However, embeddings do provide a different and often more meaningful representation of the items for the particular task at hand. We also give upper bounds and more precise guidelines for choosing the embedding dimension.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Transformer-Based Active Learning for Data-Efficient Vaccine Epitope Selection in PRRS
q-bio.BM 2026-06 unverdicted novelty 3.0

Transformer models under active learning classify high-binding epitopes from a small docking dataset more accurately than random sampling or other architectures in low-data regimes for PRRS.