DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen · 2020 · cs.CL · arXiv 2006.03654

49 Pith papers cite this work. Polarity classification is still indexing.

abstract

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with disentangled attention) that improves the BERT and RoBERTa models using two novel techniques. The first is the disentangled attention mechanism, where each word is represented using two vectors that encode its content and position, respectively, and the attention weights among words are computed using disentangled matrices on their contents and relative positions, respectively. Second, an enhanced mask decoder is used to incorporate absolute positions in the decoding layer to predict the masked tokens in model pre-training. In addition, a new virtual adversarial training method is used for fine-tuning to improve models' generalization. We show that these techniques significantly improve the efficiency of model pre-training and the performance of both natural language understanding (NLU) and natural language generation (NLG) downstream tasks. Compared to RoBERTa-Large, a DeBERTa model trained on half of the training data performs consistently better on a wide range of NLP tasks, achieving improvements on MNLI by +0.9% (90.2% vs. 91.1%), on SQuAD v2.0 by +2.3% (88.4% vs. 90.7%) and RACE by +3.6% (83.2% vs. 86.8%). Notably, we scale up DeBERTa by training a larger version that consists of 48 Transformer layers with 1.5 billion parameters. The significant performance boost makes the single DeBERTa model surpass the human performance on the SuperGLUE benchmark (Wang et al., 2019a) for the first time in terms of macro-average score (89.9 versus 89.8), and the ensemble DeBERTa model sits atop the SuperGLUE leaderboard as of January 6, 2021, outperforming the human baseline by a decent margin (90.3 versus 89.8).
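
The disentangled attention idea in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not spell out the formula, so the three-term score (content-to-content, content-to-position, position-to-content), the 1/sqrt(3d) scaling, and the clipped relative-distance index used here are assumptions taken from a reading of the paper body, not from this page. All function and parameter names are hypothetical, and the inputs are assumed to be already-projected query/key vectors; this is not the authors' implementation.

```python
import math


def relative_bucket(i, j, k):
    """Clip the signed distance i - j into the index range [0, 2k - 1]."""
    distance = i - j
    if distance <= -k:
        return 0
    if distance >= k:
        return 2 * k - 1
    return distance + k


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def disentangled_attention(q_content, k_content, q_position, k_position, k):
    """Return an n x n matrix of attention weights.

    q_content, k_content: n vectors of size d (one per token).
    q_position, k_position: 2k vectors of size d (one per relative bucket).
    """
    n = len(q_content)
    d = len(q_content[0])
    scale = math.sqrt(3 * d)  # assumed scaling for the three summed terms
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            # content of token i attends to content of token j
            c2c = dot(q_content[i], k_content[j])
            # content of token i attends to the relative position of j
            c2p = dot(q_content[i], k_position[relative_bucket(i, j, k)])
            # relative position of i (seen from j) attends to content of j
            p2c = dot(k_content[j], q_position[relative_bucket(j, i, k)])
            row.append((c2c + c2p + p2c) / scale)
        weights.append(softmax(row))
    return weights
```

Keeping the content and position vectors separate means the position terms can be zeroed out independently: with all-zero position vectors the sketch reduces to ordinary scaled dot-product attention over content alone.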

citation-role summary

background: 2 · baseline: 1 · method: 1

citation-polarity summary

background: 2 · baseline: 1 · use method: 1

representative citing papers

ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

cs.CL · 2026-04-30 · accept · novelty 8.0

ViLegalNLI is the first 42k-pair Vietnamese legal NLI dataset built via semi-automatic LLM-assisted generation and validation.

Discovering Latent Knowledge in Language Models Without Supervision

cs.CL · 2022-12-07 · conditional · novelty 8.0

An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.

RoFormer: Enhanced Transformer with Rotary Position Embedding

cs.CL · 2021-04-20 · accept · novelty 8.0

RoFormer introduces rotary position embeddings that encode absolute positions via rotation matrices and relative dependencies in attention, outperforming prior position methods on long text classification tasks.

Semantic Reranking at Inference Time for Hard Examples in Rhetorical Role Labeling

cs.CL · 2026-05-18 · unverdicted · novelty 7.0

RISE is an inference-time semantic reranking framework that refines low-confidence predictions in rhetorical role labeling using contrastively learned label representations, delivering an average +9.15 macro-F1 gain on hard examples across eight datasets and seven models.

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

cs.CL · 2026-05-01 · unverdicted · novelty 7.0

DSR uses transformer models to detect sentiment targets in text and score them along three theory-motivated axes, with validation showing correlations to existing social science datasets.

RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners

cs.CL · 2026-04-30 · conditional · novelty 7.0

RSAT uses SFT on verified traces followed by GRPO with NLI faithfulness rewards to make 1-8B models produce verifiable table reasoning with cell citations, raising faithfulness 3.7x to 0.826.

Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER

cs.CL · 2026-04-06 · unverdicted · novelty 7.0

JPT enables bidirectional token classification in causal LLMs for zero-shot NER via input concatenation plus definition-guided embeddings, delivering +7.9 F1 gains and over 20x speedup on benchmarks.

The Indra Representation Hypothesis for Multimodal Alignment

cs.CV · 2026-04-06 · unverdicted · novelty 7.0

Unimodal model representations converge to a relational structure captured by the Indra representation via V-enriched Yoneda embedding, which is unique and structure-preserving and improves cross-model and cross-modal robustness when instantiated with angular distance.

Indirect Question Answering in English, German and Bavarian: A Challenging Task for High- and Low-Resource Languages Alike

cs.CL · 2026-03-16 · unverdicted · novelty 7.0

IQA is a pragmatically difficult task where multilingual models achieve low performance and overfit severely, even for English, and GPT-4o-mini cannot generate high-quality training data for it.

Group Representational Position Encoding

cs.LG · 2025-12-08 · unverdicted · novelty 7.0

GRAPE unifies RoPE and ALiBi as special cases of group actions on positions, providing a principled design space for positional encodings via SO(d) rotations and GL unipotent transformations.

When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA

cs.CV · 2025-11-03 · conditional · novelty 7.0

QA-SNNE adds question-answer alignment via bilateral gating to semantic nearest neighbor entropy, yielding higher AUROC for uncertainty detection in surgical VQA models under both standard and rephrased questions.

Clotho: Measuring Task-Specific Pre-Generation Test Adequacy for LLM Inputs

cs.SE · 2025-09-22 · unverdicted · novelty 7.0

Clotho ranks LLM test inputs by failure likelihood using pre-generation hidden states and GMMs, achieving 0.716 ROC-AUC after labeling 5.4% of inputs on average across eight tasks and three models, with transfer to proprietary models.

GHI: Graphormer over Conditioned Hypergraph Incidence for Aspect-Based Sentiment Analysis

cs.CL · 2026-05-21 · unverdicted · novelty 6.0

GHI introduces an incidence-based structural reasoning layer using Graphormer on conditioned hypergraphs for ABSA, reporting outperformance on SemEval benchmarks, near-parity with 11B models at 247M parameters, and robustness on ARTS.

From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents

cs.CL · 2026-05-14 · unverdicted · novelty 6.0

A dataset-agnostic framework converts text tool-calling benchmarks to paired audio evaluations via TTS, speaker variation and noise, then evaluates seven omni-modal models showing model- and task-dependent performance with small text-to-voice gaps.

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

cs.CR · 2026-05-11 · conditional · novelty 6.0

Generative AI enables scalable, context-aware spear phishing by extracting profiles from public social media, producing emails that outperform real-world phishing samples in personalization and lower recipient suspicion.

An Information-theoretic Propagation Denoising and Fusion Framework for Fake News Detection

cs.CL · 2026-05-04 · unverdicted · novelty 6.0

InfoPDF uses mutual information to suppress noise in LLM-generated synthetic propagation graphs and adaptively fuse them with real data, yielding more discriminative representations for fake news detection.

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

cs.CR · 2026-04-30 · unverdicted · novelty 6.0

TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.

ADE: Adaptive Dictionary Embeddings -- Scaling Multi-Anchor Representations to Large Language Models

cs.CL · 2026-04-27 · unverdicted · novelty 6.0

ADE scales multi-anchor word representations to transformers via Vocabulary Projection, Grouped Positional Encoding, and context-aware reweighting, achieving 98.7% fewer trainable parameters than DeBERTa-v3-base while matching or exceeding it on two text-classification benchmarks and compressing the

EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce

cs.CL · 2026-04-27 · unverdicted · novelty 6.0

EPM-RL uses PEFT followed by RL with agent-based rewards from judge models to create a trainable in-house product mapping model that improves on fine-tuning alone and beats API baselines in quality-cost while enabling private use.

Beyond Importance Sampling: Rejection-Gated Policy Optimization

cs.LG · 2026-04-16 · unverdicted · novelty 6.0

RGPO replaces importance sampling with a smooth [0,1] acceptance gate in policy gradients, unifying TRPO/PPO/REINFORCE, bounding variance for heavy-tailed ratios, and showing gains in online RLHF experiments.

RouterWise: Joint Resource Allocation and Routing for Latency-Aware Multi-Model LLM Serving

cs.NI · 2026-04-13 · unverdicted · novelty 6.0 · 2 refs

Joint resource allocation and routing for multi-model LLM serving can produce up to 87% variation in achievable output quality across setups on the same GPU cluster.

Entities as Retrieval Signals: A Systematic Study of Coverage, Supervision, and Evaluation in Entity-Oriented Ranking

cs.IR · 2026-04-06 · conditional · novelty 6.0

Entity signals cover only 19.7% of relevant documents on Robust04 and no configuration among 443 systems improves MAP by more than 0.05 in open-world evaluation, despite gains when entities are pre-restricted.

Million Tutoring Moves (MTM): An Open Multimodal Dataset for the Science of Tutoring

cs.CY · 2026-04-03 · accept · novelty 6.0

MTM v1 releases 4,654 open math tutoring transcripts as the first step toward a large-scale multimodal repository for studying and improving tutoring.

Overconfidence and Calibration in Medical VQA: Empirical Findings and Hallucination-Aware Mitigation

cs.CV · 2026-04-02 · conditional · novelty 6.0

Empirical study finds overconfidence persists in medical VLMs despite scaling and prompting; post-hoc calibration reduces error while hallucination-aware calibration improves both calibration and AUROC.

citing papers explorer

Showing 49 of 49 citing papers.

ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts cs.CL · 2026-04-30 · accept · none · ref 30 · internal anchor
ViLegalNLI is the first 42k-pair Vietnamese legal NLI dataset built via semi-automatic LLM-assisted generation and validation.
Discovering Latent Knowledge in Language Models Without Supervision cs.CL · 2022-12-07 · conditional · none · ref 10 · internal anchor
An unsupervised technique extracts latent yes-no knowledge from language model activations by locating a direction that satisfies logical consistency properties, outperforming zero-shot accuracy by 4% on average across models and datasets.
RoFormer: Enhanced Transformer with Rotary Position Embedding cs.CL · 2021-04-20 · accept · none · ref 4 · internal anchor
RoFormer introduces rotary position embeddings that encode absolute positions via rotation matrices and relative dependencies in attention, outperforming prior position methods on long text classification tasks.
Semantic Reranking at Inference Time for Hard Examples in Rhetorical Role Labeling cs.CL · 2026-05-18 · unverdicted · none · ref 118 · internal anchor
RISE is an inference-time semantic reranking framework that refines low-confidence predictions in rhetorical role labeling using contrastively learned label representations, delivering an average +9.15 macro-F1 gain on hard examples across eight datasets and seven models.
Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media cs.CL · 2026-05-01 · unverdicted · none · ref 37 · internal anchor
DSR uses transformer models to detect sentiment targets in text and score them along three theory-motivated axes, with validation showing correlations to existing social science datasets.
RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners cs.CL · 2026-04-30 · conditional · none · ref 32 · internal anchor
RSAT uses SFT on verified traces followed by GRPO with NLI faithfulness rewards to make 1-8B models produce verifiable table reasoning with cell citations, raising faithfulness 3.7x to 0.826.
Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER cs.CL · 2026-04-06 · unverdicted · none · ref 1 · internal anchor
JPT enables bidirectional token classification in causal LLMs for zero-shot NER via input concatenation plus definition-guided embeddings, delivering +7.9 F1 gains and over 20x speedup on benchmarks.
The Indra Representation Hypothesis for Multimodal Alignment cs.CV · 2026-04-06 · unverdicted · none · ref 25 · internal anchor
Unimodal model representations converge to a relational structure captured by the Indra representation via V-enriched Yoneda embedding, which is unique and structure-preserving and improves cross-model and cross-modal robustness when instantiated with angular distance.
Indirect Question Answering in English, German and Bavarian: A Challenging Task for High- and Low-Resource Languages Alike cs.CL · 2026-03-16 · unverdicted · none · ref 13 · internal anchor
IQA is a pragmatically difficult task where multilingual models achieve low performance and overfit severely, even for English, and GPT-4o-mini cannot generate high-quality training data for it.
Group Representational Position Encoding cs.LG · 2025-12-08 · unverdicted · none · ref 8 · internal anchor
GRAPE unifies RoPE and ALiBi as special cases of group actions on positions, providing a principled design space for positional encodings via SO(d) rotations and GL unipotent transformations.
When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA cs.CV · 2025-11-03 · conditional · none · ref 18 · internal anchor
QA-SNNE adds question-answer alignment via bilateral gating to semantic nearest neighbor entropy, yielding higher AUROC for uncertainty detection in surgical VQA models under both standard and rephrased questions.
Clotho: Measuring Task-Specific Pre-Generation Test Adequacy for LLM Inputs cs.SE · 2025-09-22 · unverdicted · none · ref 16 · internal anchor
Clotho ranks LLM test inputs by failure likelihood using pre-generation hidden states and GMMs, achieving 0.716 ROC-AUC after labeling 5.4% of inputs on average across eight tasks and three models, with transfer to proprietary models.
GHI: Graphormer over Conditioned Hypergraph Incidence for Aspect-Based Sentiment Analysis cs.CL · 2026-05-21 · unverdicted · none · ref 45 · internal anchor
GHI introduces an incidence-based structural reasoning layer using Graphormer on conditioned hypergraphs for ABSA, reporting outperformance on SemEval benchmarks, near-parity with 11B models at 247M parameters, and robustness on ARTS.
From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents cs.CL · 2026-05-14 · unverdicted · none · ref 13 · internal anchor
A dataset-agnostic framework converts text tool-calling benchmarks to paired audio evaluations via TTS, speaker variation and noise, then evaluates seven omni-modal models showing model- and task-dependent performance with small text-to-voice gaps.
Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data cs.CR · 2026-05-11 · conditional · none · ref 42 · internal anchor
Generative AI enables scalable, context-aware spear phishing by extracting profiles from public social media, producing emails that outperform real-world phishing samples in personalization and lower recipient suspicion.
An Information-theoretic Propagation Denoising and Fusion Framework for Fake News Detection cs.CL · 2026-05-04 · unverdicted · none · ref 12 · internal anchor
InfoPDF uses mutual information to suppress noise in LLM-generated synthetic propagation graphs and adaptively fuse them with real data, yielding more discriminative representations for fake news detection.
TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning cs.CR · 2026-04-30 · unverdicted · none · ref 10 · internal anchor
TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
ADE: Adaptive Dictionary Embeddings -- Scaling Multi-Anchor Representations to Large Language Models cs.CL · 2026-04-27 · unverdicted · none · ref 5 · internal anchor
ADE scales multi-anchor word representations to transformers via Vocabulary Projection, Grouped Positional Encoding, and context-aware reweighting, achieving 98.7% fewer trainable parameters than DeBERTa-v3-base while matching or exceeding it on two text-classification benchmarks and compressing the
EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce cs.CL · 2026-04-27 · unverdicted · none · ref 12 · internal anchor
EPM-RL uses PEFT followed by RL with agent-based rewards from judge models to create a trainable in-house product mapping model that improves on fine-tuning alone and beats API baselines in quality-cost while enabling private use.
Beyond Importance Sampling: Rejection-Gated Policy Optimization cs.LG · 2026-04-16 · unverdicted · none · ref 4 · internal anchor
RGPO replaces importance sampling with a smooth [0,1] acceptance gate in policy gradients, unifying TRPO/PPO/REINFORCE, bounding variance for heavy-tailed ratios, and showing gains in online RLHF experiments.
RouterWise: Joint Resource Allocation and Routing for Latency-Aware Multi-Model LLM Serving cs.NI · 2026-04-13 · unverdicted · none · ref 18 · 2 links · internal anchor
Joint resource allocation and routing for multi-model LLM serving can produce up to 87% variation in achievable output quality across setups on the same GPU cluster.
Entities as Retrieval Signals: A Systematic Study of Coverage, Supervision, and Evaluation in Entity-Oriented Ranking cs.IR · 2026-04-06 · conditional · none · ref 7 · internal anchor
Entity signals cover only 19.7% of relevant documents on Robust04 and no configuration among 443 systems improves MAP by more than 0.05 in open-world evaluation, despite gains when entities are pre-restricted.
Million Tutoring Moves (MTM): An Open Multimodal Dataset for the Science of Tutoring cs.CY · 2026-04-03 · accept · none · ref 4 · internal anchor
MTM v1 releases 4,654 open math tutoring transcripts as the first step toward a large-scale multimodal repository for studying and improving tutoring.
Overconfidence and Calibration in Medical VQA: Empirical Findings and Hallucination-Aware Mitigation cs.CV · 2026-04-02 · conditional · none · ref 11 · internal anchor
Empirical study finds overconfidence persists in medical VLMs despite scaling and prompting; post-hoc calibration reduces error while hallucination-aware calibration improves both calibration and AUROC.
From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration cs.MA · 2026-03-04 · unverdicted · none · ref 17 · internal anchor
A graph-based propagation model for error cascades in LLM multi-agent systems plus a genealogy-graph governance plugin that prevents final infection in at least 89% of runs across tested frameworks.
Frozen LVLMs for Micro-Video Recommendation: A Systematic Study of Feature Extraction and Fusion cs.IR · 2025-12-26 · conditional · none · ref 12 · internal anchor
Intermediate decoder hidden states from frozen LVLMs fused with ID embeddings outperform caption representations and deliver state-of-the-art micro-video recommendation performance on two real-world benchmarks.
Interpretability from the Ground Up: Stakeholder-Centric Design of Automated Scoring in Educational Assessments cs.CL · 2025-11-21 · unverdicted · none · ref 21 · internal anchor
AnalyticScore applies new FGTI interpretability principles to text-based scoring and achieves accuracy within 0.06 QWK of uninterpretable state-of-the-art while matching human featurization on the ASAP-SAS dataset.
Positional Encoding via Token-Aware Phase Attention cs.CL · 2025-09-16 · unverdicted · none · ref 7 · internal anchor
TAPA adds a learnable phase function to attention to preserve long-range token interactions, enabling direct continual pretraining, length extrapolation, lower perplexity, and stronger retrieval than RoPE-style methods.
TriagerX: Dual Transformers for Bug Triaging Tasks with Content and Interaction Based Rankings cs.SE · 2025-08-23 · conditional · none · ref 70 · internal anchor
TriagerX combines dual-transformer content rankings with developer interaction history to improve top-k accuracy for developer and component recommendations in bug triaging across five datasets.
LIMO: Less is More for Reasoning cs.CL · 2025-02-05 · unverdicted · none · ref 157 · internal anchor
LIMO achieves 63.3% on AIME24 and 95.6% on MATH500 via supervised fine-tuning on roughly 1% of the data used by prior models, supporting the claim that minimal strategic examples suffice when pre-training has already encoded domain knowledge.
MiniMax-01: Scaling Foundation Models with Lightning Attention cs.CL · 2025-01-14 · unverdicted · none · ref 35 · internal anchor
MiniMax-01 models match GPT-4o and Claude-3.5-Sonnet performance while providing 20-32 times longer context windows through lightning attention and MoE scaling.
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations cs.AI · 2023-12-14 · conditional · none · ref 61 · internal anchor
Math-Shepherd is an automatically trained process reward model that scores solution steps to verify and reinforce LLMs, lifting Mistral-7B from 77.9% to 89.1% on GSM8K and 28.6% to 43.5% on MATH.
Demystifying CLIP Data cs.CV · 2023-09-28 · accept · none · ref 49 · internal anchor
MetaCLIP curates balanced 400M-pair subsets from CommonCrawl that outperform CLIP data, reaching 70.8% zero-shot ImageNet accuracy on ViT-B versus CLIP's 68.3%.
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation cs.CL · 2023-02-19 · unverdicted · none · ref 6 · internal anchor
Semantic entropy improves uncertainty estimation in natural language generation by incorporating semantic equivalences, outperforming standard entropy baselines on predicting model accuracy for question answering.
Ethical and social risks of harm from Language Models cs.CL · 2021-12-08 · accept · none · ref 109 · internal anchor
The authors provide a detailed taxonomy of 21 risks associated with language models, covering discrimination, information leaks, misinformation, malicious applications, interaction harms, and societal impacts like job loss and environmental costs.
Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering cs.CL · 2026-05-19 · unverdicted · none · ref 79 · internal anchor
Mainstream UQ for LLMs reduces to unsupervised clustering of internal generation consistency and therefore cannot detect confident hallucinations or provide reliable safety signals.
Revisiting Semantic Role Labeling: Efficient Structured Inference with Dependency-Informed Analysis cs.CL · 2026-05-04 · unverdicted · none · ref 49 · internal anchor
A new encoder-based SRL system with dependency-informed analysis delivers 10x faster inference and comparable or better F1 scores using BERT, RoBERTa, and DeBERTa while supporting multilingual projection.
VerifAI: A Verifiable Open-Source Search Engine for Biomedical Question Answering cs.IR · 2026-01-16 · unverdicted · none · ref 56 · internal anchor
VerifAI is an open-source biomedical QA system that decomposes generated answers into claims and verifies them with a fine-tuned NLI engine to reduce hallucinations and provide traceable citations.
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning cs.LG · 2025-05-16 · unverdicted · none · ref 16 · internal anchor
TokUR estimates token-level uncertainty via low-rank weight perturbations in LLMs, aggregates signals to correlate with correctness, and uses them to improve reasoning performance on math tasks.
Toward General and Robust LLM-enhanced Text-attributed Graph Learning cs.LG · 2025-04-03 · unverdicted · none · ref 6 · internal anchor
UltraTAG organizes LLM-GNN methods for text-attributed graphs; UltraTAG-S adds LLM text propagation, augmentation, PageRank node selection, and edge reconfiguration to improve robustness on sparse data, with reported gains of 2.12% and 17.47%.
Semantic Embeddings of Chemical Elements for Enhanced Materials Inference and Discovery cs.CL · 2025-02-19 · unverdicted · none · ref 36 · internal anchor
ElementBERT generates literature-derived semantic embeddings for chemical elements that outperform empirical descriptors in alloy property prediction and optimization tasks with up to 23% accuracy gains.
MIPIAD: Multilingual Indirect Prompt Injection Attack Defense with Qwen -- TF-IDF Hybrid and Meta-Ensemble Learning cs.CL · 2026-05-08 · unverdicted · none · ref 4 · internal anchor
MIPIAD reports a hybrid Qwen-TF-IDF ensemble defense that reaches F1 0.9205 and reduces the English-Bangla performance gap on a 1.43-million-sample synthetic benchmark derived from BIPIA templates.
BiMind: A Dual-Head Reasoning Model with Attention-Geometry Adapter for Incorrect Information Detection cs.CL · 2026-04-07 · unverdicted · none · ref 4 · internal anchor
BiMind outperforms existing methods in incorrect information detection by disentangling content and knowledge reasoning with attention geometry adaptation and self-retrieval.
Attribution-Driven Explainable Intrusion Detection with Encoder-Based Large Language Models cs.CR · 2026-04-07 · unverdicted · none · ref 41 · internal anchor
Encoder-based LLMs detect SDN intrusions with decisions driven by meaningful traffic behaviors, as validated by attribution analysis aligning with established intrusion principles.
LLMs Struggle with Abstract Meaning Comprehension More Than Expected cs.CL · 2026-04-13 · unverdicted · none · ref 11 · internal anchor
LLMs struggle with abstract meaning comprehension on SemEval-2021 Task 4 more than fine-tuned models, and a new bidirectional attention classifier yields small accuracy gains of 3-4%.
Predicting User Satisfaction in Online Education Platforms: A Large Language Model Based Multi-Modal Review Mining Framework cs.GR · 2026-04-13 · unverdicted · none · ref 7 · internal anchor
An LLM multi-modal system integrates topic modeling, transformer sentiment, and behavioral features to predict MOOC learner satisfaction more accurately than single-modality baselines.
Large Language Models: A Survey cs.CL · 2024-02-09 · accept · none · ref 26 · internal anchor
The paper surveys key large language models, their training methods, datasets, evaluation benchmarks, and future research directions in the field.
Bridging Language Models and Financial Analysis q-fin.ST · 2025-03-14 · unverdicted · none · ref 37 · internal anchor
A survey synthesizing recent LLM research and assessing its applicability to financial data analysis.
Findings of the Counter Turing Test: AI-Generated Text Detection cs.CL · 2026-05-20 · unreviewed · ref 22 · internal anchor
