HypCBM reformulates concept activations as geometric containment in hyperbolic space to produce sparse, hierarchy-aware signals that match Euclidean models trained on 20 times more data.
WordNet: a lexical database for English
19 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
S²R² improves robustness of LoRA-tuned LLMs to prompt perturbations by penalizing semantic-segment drift while preserving clean performance and cross-dataset transfer.
DualGuard uses adaptive dual-stream watermark signals to detect and trace both paraphrase and spoofing attacks in LLM outputs while preserving text quality.
BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.
ToxiREX is a new dataset of 128k Reddit comments in six languages with hierarchical annotations for implicit toxicity in conversational context based on an existing reasoning schema.
Develops ACW-based semantic timescale features and shows that longer autocorrelation windows are associated with generic vocabulary and shorter ones with specific words in both human and LLM speech; the pattern disappears when word order and timing are randomized.
Semantic Histograms treat semantic image filters as implicit range queries in embedding space and use two specificity estimators whose ensemble reduces end-to-end query optimization and execution overhead by up to 86%.
LLMs prompted with few-shot examples and rationales generate better reasoned distractors for MCQs than fine-tuned contrastive models across six benchmarks.
EMERGE is a benchmark dataset of 233K Wikipedia passages paired with 1.45 million Wikidata edit operations across seven yearly snapshots from 2019 to 2025 for evaluating knowledge graph updates from emerging text.
Word2Vec on Toki Pona shows distributional patterns suffice for semantic structure even at extreme vocabulary reduction, and incidental non-core tokens tighten rather than disrupt clusters.
The authors release the multimodal WJoconde knowledge graph for French cultural heritage and an LLM-VLM pipeline that extracts and validates new triples from unstructured text and images to extend the graph.
Frozen DINOv2-L features with k-NN classification and PCA/ICA refinement achieve state-of-the-art few-shot performance on four benchmarks without any backpropagation or fine-tuning.
Gradient-boosted models with SHAP analysis identify word familiarity as the dominant predictor of English vocabulary difficulty across Spanish, German, and Chinese L1 learners, with orthographic transfer adding value only for the first two groups.
Qwen-Image is a foundation model that reaches state-of-the-art results in image generation and editing by combining a large-scale text-focused data pipeline with curriculum learning and dual semantic-reconstructive encoding for editing consistency.
Open-weight instruction-aware encoders capture equal or greater affective information than proprietary models at word level across emotion theories, while task-tuned and proprietary encoders perform best on sentence-level classification.
Replacing tokens, freezing the corresponding embeddings, and tuning the rest of the model improves NLU performance on low-resource languages compared to full fine-tuning.
Gyan is a novel explainable non-transformer language model that achieves SOTA results on multiple datasets by mimicking human-like compositional context and world models.
Systematic experiments show that text decomposition methods and privacy budget allocation strategies produce significantly different privacy-utility trade-offs even under comparable total epsilon budgets.
A symbolic system extracts events from 450 property crime reports, with 54.1% high-confidence outputs, 93.7% mapped via PropBank-VerbNet-WordNet, and 100% human agreement on incident initiation, stolen items, and temporal cues.
citing papers explorer
Language Models as Knowledge Bases?