WordNet: a lexical database for English

Miller, G · 1995 · arXiv 9717.219748

19 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background: 2 · dataset: 1

citation-polarity summary

background: 1 · unclear: 1 · use dataset: 1

representative citing papers

Hyperbolic Concept Bottleneck Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

HypCBM reformulates concept activations as geometric containment in hyperbolic space to produce sparse, hierarchy-aware signals that match Euclidean models trained on 20 times more data.

Where Do Prompt Perturbations Break Generation? A Segment-Level View of Robustness in LoRA-Tuned Language Models

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

S²R² improves robustness of LoRA-tuned LLMs to prompt perturbations by penalizing semantic-segment drift while preserving clean performance and cross-dataset transfer.

DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack

cs.CR · 2025-12-18 · unverdicted · novelty 7.0

DualGuard uses adaptive dual-stream watermark signals to detect and trace both paraphrase and spoofing attacks in LLM outputs while preserving text quality.

Language Models as Knowledge Bases?

cs.CL · 2019-09-03 · accept · novelty 7.0

BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.

ToxiREX: A Dataset on Toxic REasoning in ConteXt

cs.CL · 2026-06-26 · unverdicted · novelty 6.0

ToxiREX is a new dataset of 128k Reddit comments in six languages with hierarchical annotations for implicit toxicity in conversational context based on an existing reasoning schema.

The Dynamics of Human and AI-Generated Language: How Semantics Fluctuates across Different Timescales

cs.CL · 2026-06-09 · unverdicted · novelty 6.0

Develops ACW-based semantic timescale features showing longer autocorrelation windows associate with generic vocabulary and shorter ones with specific words in both human and LLM speech, with the pattern abolished by randomizing word order and timing.

Selectivity Estimation for Semantic Filters on Image Data

cs.DB · 2026-06-03 · unverdicted · novelty 6.0

Semantic Histograms treat semantic image filters as implicit range queries in embedding space and use two specificity estimators whose ensemble reduces end-to-end query optimization and execution overhead by up to 86%.

Beyond Fine-Tuning: In-Context Learning and Chain-of-Thought for Reasoned Distractor Generation

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

LLMs prompted with few-shot examples and rationales generate better reasoned distractors for MCQs than fine-tuned contrastive models across six benchmarks.

EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge

cs.CL · 2025-07-04 · accept · novelty 6.0

EMERGE is a benchmark dataset of 233K Wikipedia passages paired with 1.45 million Wikidata edit operations across seven yearly snapshots from 2019 to 2025 for evaluating knowledge graph updates from emerging text.

Examining the Limits of Word2Vec with Toki Pona

cs.CL · 2026-06-15 · unverdicted · novelty 5.0

Word2Vec on Toki Pona shows distributional patterns suffice for semantic structure even at extreme vocabulary reduction, and incidental non-core tokens tighten rather than disrupt clusters.

Multimodal Cultural Heritage Knowledge Graph Extension with Language and Vision Models

cs.AI · 2026-05-17 · unverdicted · novelty 5.0

Authors release the multimodal WJoconde knowledge graph for French cultural heritage and an LLM-VLM pipeline that extracts and validates new triples from unstructured text and images to extend the graph.

Rethinking the Good Enough Embedding for Easy Few-Shot Learning

cs.CV · 2026-05-13 · conditional · novelty 5.0

Frozen DINOv2-L features with k-NN classification and PCA/ICA refinement achieve state-of-the-art few-shot performance on four benchmarks without any backpropagation or fine-tuning.

What makes a word hard to learn? Modeling L1 influence on English vocabulary difficulty

cs.CL · 2026-05-12 · unverdicted · novelty 5.0

Gradient-boosted models with SHAP analysis find word familiarity as the dominant predictor of English vocabulary difficulty across Spanish, German, and Chinese L1 learners, with orthographic transfer adding value only for the first two groups.

Qwen-Image Technical Report

cs.CV · 2025-08-04 · unverdicted · novelty 5.0

Qwen-Image is a foundation model that reaches state-of-the-art results in image generation and editing by combining a large-scale text-focused data pipeline with curriculum learning and dual semantic-reconstructive encoding for editing consistency.

A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories

cs.CL · 2026-06-27 · unverdicted · novelty 4.0

Open-weight instruction-aware encoders capture equal or greater affective information than proprietary models at word level across emotion theories, while task-tuned and proprietary encoders perform best on sentence-level classification.

Modular Monolingual Adaptation using Pretrained Language Models

cs.CL · 2026-06-04 · unverdicted · novelty 4.0

Replacing tokens, freezing the corresponding embeddings, and tuning the rest of the model improves NLU performance on low-resource languages compared to full fine-tuning.

Gyan: An Explainable Neuro-Symbolic Language Model

cs.CL · 2026-05-06 · unverdicted · novelty 4.0

Gyan is a novel explainable non-transformer language model that achieves SOTA results on multiple datasets by mimicking human-like compositional context and world models.

A Systematic Exploration of Text Decomposition and Budget Distribution in Differentially Private Text Obfuscation

cs.CL · 2026-05-01 · unverdicted · novelty 4.0

Systematic experiments show that text decomposition methods and privacy budget allocation strategies produce significantly different privacy-utility trade-offs even under comparable total epsilon budgets.

Ontology for Policing: Conceptual Knowledge Learning for Semantic Understanding and Reasoning in Law Enforcement Reports

cs.CL · 2026-05-15 · unverdicted · novelty 3.0

A symbolic system extracts events from 450 property crime reports, with 54.1% high-confidence outputs, 93.7% mapped via PropBank-VerbNet-WordNet, and 100% human agreement on incident initiation, stolen items, and temporal cues.

citing papers explorer


Language Models as Knowledge Bases?

cs.CL · 2019-09-03 · accept · none · ref 294

BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.
