URL https://huggingface.co/datasets/Shitao/ bge-m3-data

Shitao/bge-m3-data datasets at Hugging Face · 2024

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Comparison of Modern Multilingual Text Embedding Techniques for Hate Speech Detection Task

cs.CL · 2026-04-16 · unverdicted · novelty 4.0

Supervised models using embeddings like jina and e5 reach up to 92% accuracy on multilingual hate speech detection, substantially outperforming anomaly detection, while PCA to 64 dimensions preserves most performance in the supervised case.

citing papers explorer

Showing 1 of 1 citing paper.

Comparison of Modern Multilingual Text Embedding Techniques for Hate Speech Detection Task cs.CL · 2026-04-16 · unverdicted · none · ref 64
Supervised models using embeddings like jina and e5 reach up to 92% accuracy on multilingual hate speech detection, substantially outperforming anomaly detection, while PCA to 64 dimensions preserves most performance in the supervised case.

URL https://huggingface.co/datasets/Shitao/ bge-m3-data

fields

years

verdicts

representative citing papers

citing papers explorer