Multilingual embedding probes achieve strong in-distribution CEFR prediction (QWK ≈ 0.7) but fail to generalize across corpora, converging to uniform predictions and capturing corpus-specific features instead of language-general proficiency.
5 Pith papers cite this work.
representative citing papers
VLMs show answer inertia in CoT reasoning and remain influenced by misleading textual cues even with sufficient visual evidence, making CoT an incomplete window into modality reliance.
citing papers explorer
- Logic of Hypotheses: from Zero to Full Knowledge in Neurosymbolic Integration
LoH adds a learnable choice operator to propositional logic, compiles formulas to differentiable graphs via fuzzy logic, subsumes prior NeSy models, and supports discretization to Boolean functions via the Gödel trick.
- What Is The Political Content in LLMs' Pre- and Post-Training Data?
Training data for open LLMs is systematically left-leaning, with pre-training corpora containing more political material than post-training data and model stances aligning with data distributions.