A matched benchmark shows GUI computer-use agents at 59.1% full pass rate versus 48.2% for original-skill CLI agents, rising to 69.3% with verifier-guided augmentation, indicating modality-specific execution bottlenecks.
Multi-BERT: Leveraging Adapters for Low-Resource Multi-Domain Adaptation
12 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (12)
representative citing papers
Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.
Lexical richness is a robust linguistic signal for AI-generated text detection across models and domains, while most other features are context-dependent.
Replacing tokens, freezing the corresponding embeddings, and tuning the rest of the model improves NLU performance on low-resource languages compared to full fine-tuning.
Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.
LLM-generated ML pipelines show higher bias (87.7% sensitive attributes) than conditional statements (59.2%), indicating that simple if-statement tests underestimate bias risk in practical code generation.
Introduces LLM Consumer Behavior Theory to analyze consumer behavior when LLMs serve as autonomous decision-making agents in markets.
A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.
Finetuning Qwen3-32B with data augmentation and self-training achieves competitive 8th-place ranking on SemEval-2026 conspiracy detection.
Finetuning LLMs with QLoRA and multilingual data augmentation for polarization detection, type, and manifestation in SemEval-2026 Task 9.
Fine-tuning LLMs by adapting the mdok approach produces competitive results on binary detection, source attribution, and hybrid/adversarial code identification in SemEval-2026 Task 13.
citing papers explorer
- Misaligned by Reward: Socially Undesirable Preferences in LLMs
  Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.
- A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models
  Lexical richness is a robust linguistic signal for AI-generated text detection across models and domains, while most other features are context-dependent.
- Modular Monolingual Adaptation using Pretrained Language Models
  Replacing tokens, freezing the corresponding embeddings, and tuning the rest of the model improves NLU performance on low-resource languages compared to full fine-tuning.
- Why Low-Resource NLP Needs More Than Cross-Lingual Transfer: Lessons Learned from Luxembourgish
  Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.
- From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation
  LLM-generated ML pipelines show higher bias (87.7% sensitive attributes) than conditional statements (59.2%), indicating that simple if-statement tests underestimate bias risk in practical code generation.
- FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals
  A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.
- mdok-style at SemEval-2026 Task 10: Finetuning LLMs for Conspiracy Detection
  Finetuning Qwen3-32B with data augmentation and self-training achieves competitive 8th-place ranking on SemEval-2026 conspiracy detection.
- mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection
  Finetuning LLMs with QLoRA and multilingual data augmentation for polarization detection, type, and manifestation in SemEval-2026 Task 9.
- Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison-Eliciting Posts They Fail to Detect