arXiv preprint arXiv:2503.05500
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (4)
representative citing papers
Controlled ablations of 38 models find MLM superior to CLM on representation benchmarks while CLM offers better data efficiency and stability; a biphasic CLM-then-MLM schedule is optimal under fixed compute and improves when initialized from pretrained CLM models.
A distillation-plus-task-contrastive training regimen yields compact embedding models that match or exceed state-of-the-art performance for their size while supporting 32k-token contexts and quantization.
Transformer models predict German political ideology on a continuous left-right scale, reaching F1 0.844 in-domain and MAE 0.172 on newspaper out-of-domain tests.
citing papers explorer
- Is She Even Relevant? When BERT Ignores Explicit Gender Cues
  A Dutch BERT model encodes gender linearly by epoch 20 but does not dynamically update its representations when explicit female cues contradict learned stereotypical associations in short sentence templates.
- Should We Still Pretrain Encoders with Masked Language Modeling?
  Controlled ablations of 38 models find MLM superior to CLM on representation benchmarks while CLM offers better data efficiency and stability; a biphasic CLM-then-MLM schedule is optimal under fixed compute and improves when initialized from pretrained CLM models.
- jina-embeddings-v5-text: Task-Targeted Embedding Distillation
  A distillation-plus-task-contrastive training regimen yields compact embedding models that match or exceed state-of-the-art performance for their size while supporting 32k-token contexts and quantization.
- Ideology Prediction of German Political Texts
  Transformer models predict German political ideology on a continuous left-right scale, reaching F1 0.844 in-domain and MAE 0.172 on newspaper out-of-domain tests.