pith. machine review for the scientific record

arxiv: 1604.06737 · v1 · submitted 2016-04-22 · 💻 cs.LG

Recognition: unknown

Entity Embeddings of Categorical Variables

Authors on Pith: no claims yet
classification 💻 cs.LG
keywords categorical, entity, variables, embedding, neural, data, embeddings, features
Original abstract

We map categorical variables in a function approximation problem into Euclidean spaces, which are the entity embeddings of the categorical variables. The mapping is learned by a neural network during the standard supervised training process. Entity embedding not only reduces memory usage and speeds up neural networks compared with one-hot encoding, but more importantly, by mapping similar values close to each other in the embedding space, it reveals the intrinsic properties of the categorical variables. We applied it successfully in a recent Kaggle competition and were able to reach third position with relatively simple features. We further demonstrate in this paper that entity embedding helps the neural network to generalize better when the data is sparse and statistics are unknown. Thus it is especially useful for datasets with many high-cardinality features, where other methods tend to overfit. We also demonstrate that the embeddings obtained from the trained neural network boost the performance of all tested machine learning methods considerably when used as the input features instead. As entity embedding defines a distance measure for categorical variables, it can be used for visualizing categorical data and for data clustering.
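The abstract's core idea — an entity embedding is just a learnable lookup table, one dense row per category, trained jointly with the rest of the network — can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the table below is randomly initialized rather than trained, and the sizes (1000 categories, 16 dimensions) are made-up assumptions.

```python
import random

random.seed(0)

n_categories = 1000  # cardinality of the categorical variable
dim = 16             # embedding dimension, far smaller than n_categories

# The entity embedding is a lookup table: one dense vector per category.
# In a real network these rows are weights that receive gradients during
# the standard supervised training process.
embedding = [[random.gauss(0.0, 0.05) for _ in range(dim)]
             for _ in range(n_categories)]

def embed(ids):
    """Map integer-encoded category ids to their dense vectors."""
    return [embedding[i] for i in ids]

batch = [3, 17, 3, 999]        # integer-encoded category ids
dense = embed(batch)           # 4 vectors of length 16

# Contrast with one-hot encoding, which needs n_categories columns per row:
one_hot = [[1.0 if j == i else 0.0 for j in range(n_categories)]
           for i in batch]

print(len(dense[0]), len(one_hot[0]))  # 16 1000
```

The memory and speed savings the abstract mentions fall out of the shapes: the dense input is 16 columns wide instead of 1000, and because similar categories can share nearby rows after training, distances between rows give the categorical distance measure used for visualization and clustering.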

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Self-Improving Tabular Language Models via Iterative Group Alignment

    cs.LG 2026-04 unverdicted novelty 7.0

    TabGRAA enables self-improving tabular language models through iterative group-relative advantage alignment using modular automated quality signals like distinguishability classifiers.

  2. $\phi$-DeepONet: A Discontinuity Capturing Neural Operator

    cs.CE 2026-04 unverdicted novelty 6.0

    φ-DeepONet learns mappings with discontinuities in inputs and outputs by combining multiple branch networks with a nonlinear interface embedding in the trunk, trained via physics- and interface-informed loss, and show...

  3. Integrating SAINT with Tree-Based Models: A Case Study in Employee Attrition Prediction

    cs.LG 2026-04 unverdicted novelty 2.0

    Standalone tree-based models outperform both SAINT and SAINT-embedding hybrids for employee attrition prediction on tabular HR data.