Title resolution pending

Mistral 7B · 2023 · cs.CL · arXiv 2310.06825

Mixed citation behavior. Most common role is background (61%).

576 Pith papers citing it

Background 61% of classified citations

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

abstract

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B -- Instruct, that surpasses the Llama 2 13B -- Chat model both on human and automated benchmarks. Our models are released under the Apache 2.0 license.
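The two attention mechanisms named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function names, the toy sizes, and the convention that a window of size W covers the current token plus the W - 1 tokens before it are all assumptions made for illustration.

```python
def sliding_window_mask(seq_len, window):
    """Causal sliding-window mask (assumed convention).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future and lies within the last `window` positions.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: consecutive query heads share one key/value head."""
    group_size = n_q_heads // n_kv_heads  # query heads per key/value head
    return q_head // group_size


# Toy sizes, chosen only to keep the printout small.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# With 8 query heads and 2 key/value heads, heads 0-3 map to 0 and heads 4-7 map to 1.
print([kv_head_for_query_head(h, n_q_heads=8, n_kv_heads=2) for h in range(8)])
```

Each row of the mask has at most `window` allowed positions, which is what bounds the per-token attention cost regardless of sequence length; sharing key/value heads across a group reduces the size of the key/value cache at inference time.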

citation-role summary

background 57 · method 15 · baseline 10 · other 6 · dataset 2

citation-polarity summary

background 55 · use method 15 · baseline 10 · unclear 8 · use dataset 2
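Both summaries add up to 90 classified citations. The sketch below converts the counts into percentage shares. The hub does not state which tally the headline 61% figure is drawn from; the observation that 55 of 90 rounds to 61% is only a consistency check, and the helper name `shares` is hypothetical.

```python
# Counts copied from the role and polarity summaries above.
role_counts = {"background": 57, "method": 15, "baseline": 10, "other": 6, "dataset": 2}
polarity_counts = {"background": 55, "use method": 15, "baseline": 10, "unclear": 8, "use dataset": 2}


def shares(counts):
    """Return each label's share of the total, as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


role_shares = shares(role_counts)
polarity_shares = shares(polarity_counts)

print(sum(role_counts.values()), role_shares["background"])
print(sum(polarity_counts.values()), polarity_shares["background"])
```

By role, background accounts for 63.3% of the 90 classified citations; by polarity, 61.1%.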

authors

author = Mistral 7B

representative citing papers

CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges

cs.CL · 2026-06-18 · unverdicted · novelty 8.0

Presents a new expert-curated dataset of multi-turn counterspeech dialogues in five languages targeting hate against seven groups, with span annotations linking to verified external knowledge for RAG applications.

Entropy-Gated Latent Recursion

cs.LG · 2026-06-15 · unverdicted · novelty 8.0 · 2 refs

EGLR adds a deterministic layer-recursion axis gated by entropy that is complementary to temperature sampling, raising joint oracle accuracy on MATH-500 from 83.4% to 91.6% for a 3B model.

Do Models Share Safety Representations? Cross-Model Steering for Safe Visual Generation

cs.CV · 2026-06-03 · unverdicted · novelty 8.0

A safety direction estimated in a source LLM is transported to a target generator through lightweight alignment on benign data alone, matching native safety performance without any target-side unsafe data.

Faithfulness Metrics Don't Measure Faithfulness: A Meta-Evaluation with Ground Truth

cs.CL · 2026-05-24 · unverdicted · novelty 8.0

Introduces BonaFide benchmark of 3,066 ground-truth labeled CoTs showing most faithfulness metrics perform near chance with biases and poor scaling to longer chains.

RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis

cs.CL · 2026-05-16 · accept · novelty 8.0

RTI-Bench is the first publicly released structured dataset of CIC administrative decisions with outcome labels, exemption citations, IRAC reasoning, and timelines, built from 1,218 corpus cases and 298 PDFs, achieving 95.3% label precision on manual review and 57.3% accuracy on a Mistral 7B zero-Sh

Privacy Auditing with Zero (0) Training Run

cs.CR · 2026-05-14 · unverdicted · novelty 8.0

Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.

Crafting Reversible SFT Behaviors in Large Language Models

cs.LG · 2026-05-07 · unverdicted · novelty 8.0

LCDD creates sparse carriers for SFT behaviors that SFT-Eraser can reverse, with ablations showing the sparse structure enables causal control.

DurableUn: Quantization-Induced Recovery Attacks in Machine Unlearning

cs.LG · 2026-05-04 · conditional · novelty 8.0 · 2 refs

INT4 quantization recovers up to 22 times more forgotten training data in unlearned LLMs, and the proposed DURABLEUN-SAF method is the first to maintain forgetting across BF16, INT8, and INT4 precisions.

Backdoor Attacks on Decentralised Post-Training

cs.CR · 2026-03-31 · conditional · novelty 8.0

An adversary controlling an intermediate pipeline stage in decentralized LLM post-training can inject a backdoor that reduces alignment from 80% to 6%, with the backdoor persisting in 60% of cases even after subsequent safety training.

CacheTrap: Unveiling a Stealthier Gray-Box Trojan against LLMs

cs.CR · 2025-11-27 · conditional · novelty 8.0

CacheTrap achieves 100% targeted attack success on five open-source LLMs by using an efficient search to locate and flip a single bit in the KV cache as a transient trigger, while preserving normal accuracy without the trigger.

MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation

cs.CL · 2025-07-28 · accept · novelty 8.0

MediQAl is a new French medical QA benchmark with 32k exam-sourced questions in three formats and cognitive labels, evaluated on 14 LLMs to reveal gaps between factual recall and reasoning performance.

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

cs.CV · 2024-08-23 · conditional · novelty 8.0

MME-RealWorld is the largest manually annotated high-resolution benchmark for MLLMs, where even the best models achieve less than 60% accuracy on challenging real-world tasks.

LiveBench: A Challenging, Contamination-Limited LLM Benchmark

cs.CL · 2024-06-27 · unverdicted · novelty 8.0

LiveBench is a contamination-limited LLM benchmark with auto-scored challenging tasks from recent sources across math, coding, reasoning and more, where top models score below 70%.

ORPO: Monolithic Preference Optimization without Reference Model

cs.CL · 2024-03-12 · conditional · novelty 8.0

ORPO performs preference alignment during supervised fine-tuning via a monolithic odds ratio penalty, allowing 7B models to outperform larger state-of-the-art models on alignment benchmarks.

Evaluating Very Long-Term Conversational Memory of LLM Agents

cs.CL · 2024-02-27 · unverdicted · novelty 8.0

Creates LoCoMo benchmark dataset for very long-term LLM conversational memory and shows current models struggle with lengthy dialogues and long-range temporal dynamics.

Information Dynamics of Language Communication

cs.CL · 2026-06-29 · unverdicted · novelty 7.0

The paper defines STE and SPID, two information-theoretic measures of semantic flow and decomposition in language exchanges, and applies them to four dialogue datasets.

Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings

cs.CL · 2026-06-28 · conditional · novelty 7.0

Anisotropy, quantified by dominant-dimension variance fraction, determines the best parameter-free similarity metric for text embeddings, with rank-based metrics gaining ~20% relative where cosine is weakest.

MultiHashFormer: Hash-based Generative Language Models

cs.CL · 2026-06-26 · unverdicted · novelty 7.0

MultiHashFormer enables hash-based autoregression in LMs by encoding tokens as multi-hash signatures, outperforming standard Transformers at 100M-3B scales while keeping parameter count constant for multilingual expansion.

Text-to-Image Models Need Less from Text Encoders Than You Think

cs.CV · 2026-06-02 · unverdicted · novelty 7.0

A bag-of-position-tagged-words embedding guides text-to-image diffusion models as effectively as full contextual text embeddings from standard encoders.

ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

cs.AI · 2026-06-02 · unverdicted · novelty 7.0

ClinicalMC is a benchmark of 1,275 Chinese and 5,804 English multi-course clinical samples across four stages, evaluated via a multi-agent framework on closed-source, open-source, and medical LLMs in static and dynamic settings.

EURO-5K: When Does Domain Pretraining Matter? Benchmarking Transformers for EU Reporting Obligation Extraction

cs.CL · 2026-06-02 · unverdicted · novelty 7.0

Introduces EURO-5K dataset from 136 EU acts and benchmarks full fine-tuning vs QLoRA for BERT and LLM models on reporting obligation extraction, reporting 0.89 F1 with limited gains from legal pretraining except under parameter-efficient adaptation.

Representational Capacity: Geometric Limits on Feature Representation in Transformer Language Models

cs.LG · 2026-06-01 · unverdicted · novelty 7.0

Defines representational capacity as the upper bound on distinguishable near-orthogonal directions in transformer latent spaces, derived from embedding similarity distributions and an adjusted Johnson-Lindenstrauss formula dependent on the k/d ratio.

Not All Errors Are Equal: A Systematic Study of Error Propagation in Large Language Model Inference

cs.DC · 2026-06-01 · unverdicted · novelty 7.0

A new fault-injection framework enables a systematic empirical study that produces 17 takeaways on error propagation in LLM inference and four software-only mitigation directions.

citing papers explorer

Showing 50 of 91 citing papers after filters.

CacheTrap: Unveiling a Stealthier Gray-Box Trojan against LLMs cs.CR · 2025-11-27 · conditional · none · ref 43 · internal anchor
CacheTrap achieves 100% targeted attack success on five open-source LLMs by using an efficient search to locate and flip a single bit in the KV cache as a transient trigger, while preserving normal accuracy without the trigger.
MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation cs.CL · 2025-07-28 · accept · none · ref 9 · internal anchor
MediQAl is a new French medical QA benchmark with 32k exam-sourced questions in three formats and cognitive labels, evaluated on 14 LLMs to reveal gaps between factual recall and reasoning performance.
Large Language Diffusion Models cs.CL · 2025-02-14 · unverdicted · none · ref 34 · internal anchor
LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.
Exact Flow Linear Attention: Exact Solution from Continuous-Time Dynamics cs.LG · 2025-12-14 · unverdicted · none · ref 13 · internal anchor
Exact Flow Linear Attention derives a closed-form exact update for delta-rule linear attention from continuous-time dynamics, removing Euler discretization error while preserving linear complexity and structure.
PIAST: Rapid Prompting with In-context Augmentation for Scarce Training data cs.CL · 2025-12-11 · conditional · none · ref 2 · internal anchor
PIAST iteratively optimizes few-shot examples in prompts via Monte Carlo Shapley value estimation, outperforming prior automatic prompting methods and setting new SOTA on classification, simplification, and GSM8K with modest compute.
When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation cs.LG · 2025-12-09 · conditional · none · ref 27 · internal anchor
LLM tabular generators leak memorized numeric strings, allowing a no-box attack to achieve near-perfect membership inference on some state-of-the-art models.
TSVer: A Benchmark for Fact Verification Against Time-Series Evidence cs.CL · 2025-11-02 · unverdicted · none · ref 32 · internal anchor
TSVer is a new benchmark dataset for fact verification against time-series evidence, with 304 annotated real-world claims, 400 time series, verdicts, and justifications, plus baseline results showing current models struggle.
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling cs.AI · 2025-10-16 · unverdicted · none · ref 14 · internal anchor
ToolPRM provides fine-grained intra-call process supervision via a new dataset and reward model, outperforming outcome and coarse-grained alternatives on function-calling benchmarks.
When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning cs.AI · 2025-10-08 · unverdicted · none · ref 25 · internal anchor
Anonymization in multi-agent debate reduces identity bias by equalizing self and peer weights in a Bayesian update model, quantified by the Identity Bias Coefficient.
Activation Steering with a Feedback Controller cs.LG · 2025-10-05 · unverdicted · none · ref 10 · internal anchor
Popular LLM activation steering methods are shown to act as proportional controllers; a PID steering framework is proposed that improves robustness and outperforms baselines in experiments across model families.
Aligning Language Models with Real-time Knowledge Editing cs.CL · 2025-08-02 · unverdicted · none · ref 6 · internal anchor
Presents CRAFT dataset and KEDAS paradigm for real-time knowledge editing, claiming better balanced performance on dynamic and static benchmarks than prior methods.
Soft Head Selection for Injecting ICL-Derived Task Embeddings cs.CL · 2025-07-28 · conditional · none · ref 8 · internal anchor
SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B parameters.
PRIMETIME : Limits of LLMs in Temporal Primitives cs.NE · 2025-04-22 · unverdicted · none · ref 93 · internal anchor
PRIMETIME generator reveals that LLM datetime parsing and arithmetic primitives are individually unreliable but fully learnable via fine-tuning, enabling frontier-level accuracy on event planning with small LoRA models.
Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature? cs.CL · 2025-02-11 · unverdicted · none · ref 31 · internal anchor
Evaluation of 22 LLMs shows they are more susceptible to spin in medical abstracts than humans but can recognize and mitigate it when prompted.
Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression cs.CL · 2025-02-04 · unverdicted · none · ref 44 · internal anchor
KV cache compression causes task-dependent degradation in high-density reasoning due to disrupted CoT links; ShotKV mitigates this by preserving few-shot examples as indivisible semantic units through phase separation, delivering 9-18% accuracy gains and 11% latency reduction.
Improving LLM Unlearning Robustness via Random Perturbations cs.CL · 2025-01-31 · unverdicted · none · ref 15 · internal anchor
LLM unlearning is reframed as inadvertently installing backdoor triggers on forget-tokens; Random Noise Augmentation is introduced as a defense that improves robustness with theoretical guarantees.
BlossomRec: Block-level Fused Sparse Attention Mechanism for Sequential Recommendations cs.IR · 2025-12-15 · unverdicted · none · ref 28 · internal anchor
BlossomRec is a sparse attention mechanism that uses two distinct block-level patterns for long-term and short-term interests, fused by a gated output, to reduce computation in sequential recommendation Transformers.
TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation cs.LG · 2025-11-27 · unverdicted · none · ref 25 · internal anchor
TreeCoder improves LLM code generation accuracy by representing decoding as an optimizable tree search over programs with first-class constraints for syntax, style, and execution, outperforming baselines on MBPP and SQL-Spider.
PEFT-Bench: A Parameter-Efficient Fine-Tuning Methods Benchmark cs.CL · 2025-11-26 · unverdicted · none · ref 25 · internal anchor
PEFT-Bench is a standardized end-to-end benchmark for 7 PEFT methods across 27 NLP datasets on autoregressive LLMs, accompanied by the PSCP metric that penalizes based on trainable parameters, inference speed, and training memory.
The Impact of Off-Policy Training Data on Probe Generalisation cs.AI · 2025-11-21 · unverdicted · none · ref 18 · internal anchor
Off-policy training data for LLM behavior probes causes significant generalization failures especially for intent-based behaviors like deception, and performance on coerced incentivised data correlates with real on-policy success.
Uncovering and Aligning Anomalous Attention Heads to Defend Against NLP Backdoor Attacks cs.CR · 2025-11-16 · unverdicted · none · ref 1 · internal anchor
Backdoor defense for LLMs detects anomalous attention-head similarity on triggers and applies head-wise alignment via fine-tuning to reduce attack success.
P3-LLM: An Integrated NPU-PIM Accelerator for Edge LLM Inference Using Hybrid Numerical Formats cs.AR · 2025-11-10 · unverdicted · none · ref 34 · internal anchor
P3-LLM delivers 4.9x average speedup over HBM-PIM for edge LLM inference by pairing hybrid-format quantization with iso-area-optimized low-precision PIM compute units and operator fusion.
Graph-Based Alternatives to LLMs for Human Simulation cs.CL · 2025-11-03 · conditional · none · ref 36 · internal anchor
GEMS formulates close-ended human-behavior simulation as link prediction on a heterogeneous graph and matches or exceeds LLM performance with three orders of magnitude fewer parameters across three datasets and three evaluation settings.
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling cs.CV · 2025-10-23 · unverdicted · none · ref 28 · internal anchor
RAPO++ is a three-stage prompt optimization framework combining retrieval-augmented refinement, closed-loop test-time scaling, and LLM fine-tuning to enhance text-to-video generation quality.
MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding cs.CL · 2025-10-09 · unverdicted · none · ref 51 · internal anchor
MOSAIC is a training-free multi-agent LLM framework with rationale, coding, reflection, and debugging agents plus a consolidated context window that outperforms prior methods on scientific coding benchmarks.
Red-Bandit: Test-Time Adaptation for LLM Red-Teaming via Bandit-Guided LoRA Experts cs.CL · 2025-10-08 · unverdicted · none · ref 34 · internal anchor
Red-Bandit adapts online to LLM failure modes by dynamically selecting among RL-trained LoRA attack-style experts via a bandit policy, reporting SOTA ASR@10 on AdvBench with lower-perplexity prompts.
Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models cs.CL · 2025-10-04 · unverdicted · none · ref 25 · internal anchor
Curtailing diversity in candidate pools for test-time scaling increases unsafe LLM outputs, as demonstrated by a reference-guided reduction protocol that evades standard safety classifiers across open and closed models.
EpiCache: Episodic KV Cache Management for Long-Term Conversation on Resource-Constrained Environments cs.CL · 2025-09-22 · unverdicted · none · ref 16 · internal anchor
EpiCache clusters long conversation history into coherent episodes for per-episode KV cache eviction, delivering up to 30% accuracy gains and 3.7x peak memory reduction on LongConvQA tasks under fixed budgets.
ScaleDoc: Scaling LLM-based Predicates over Large Document Collections cs.DB · 2025-09-16 · unverdicted · none · ref 14 · internal anchor
ScaleDoc achieves over 2x end-to-end speedup and up to 85% fewer LLM invocations for semantic predicates on large document collections via offline LLM representations, contrastive-trained proxy filtering, and adaptive cascades.
SpikingBrain: Spiking Brain-inspired Large Models cs.LG · 2025-09-05 · unverdicted · none · ref 16 · internal anchor
SpikingBrain-7B and SpikingBrain-76B achieve Transformer-comparable performance after continual pre-training on 150B tokens, with over 100x TTFT speedup on 4M-token sequences and 69.15% sparsity from event-driven spiking.
Principled Detection of Hallucinations in Large Language Models via Multiple Testing cs.CL · 2025-08-25 · unverdicted · none · ref 6 · internal anchor
The method aggregates multiple hallucination evaluation scores via conformal p-values to enable calibrated detection with controlled false alarm rates across LLMs and datasets.
ORFuzz: Fuzzing the "Other Side" of LLM Safety -- Testing Over-Refusal cs.SE · 2025-08-15 · unverdicted · none · ref 25 · internal anchor
ORFuzz presents the first evolutionary testing framework for LLM over-refusal together with a new benchmark of 1,855 cases that triggers over-refusal at 63.56% average across ten models.
Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token cs.CL · 2025-07-31 · conditional · none · ref 41 · internal anchor
Causal2Vec prepends a BERT-generated contextual token to decoder-only LLMs and pools its hidden state with the EOS token to reach new SOTA on MTEB among public-data-trained embedding models.
Accelerating Prefilling via Decoding-time Contribution Sparsity cs.CL · 2025-07-29 · conditional · none · ref 9 · internal anchor
TriangleMix exploits decoding-time contribution sparsity via a training-free static attention pattern to accelerate LLM prefilling with nearly lossless performance.
ReGATE: Learning Faster and Better with Fewer Tokens in MLLMs cs.CV · 2025-07-29 · unverdicted · none · ref 16 · internal anchor
ReGATE introduces a teacher-student adaptive token elision method that reduces training tokens to 38% while matching or exceeding baseline accuracy on multimodal benchmarks.
SessionIntentBench: A Multi-task Inter-session Intention-shift Modeling Benchmark for E-commerce Customer Behavior Understanding cs.CL · 2025-07-27 · unverdicted · none · ref 13 · internal anchor
SessionIntentBench is a large-scale multimodal benchmark for inter-session intention-shift modeling in e-commerce, with 1.95M intention entries and human-annotated gold labels showing current L(V)LMs struggle but improve when intention is injected.
Lizard: An Efficient Linearization Framework for Large Language Models cs.CL · 2025-07-11 · unverdicted · none · ref 11 · internal anchor
Lizard linearizes Transformer LLMs via subquadratic attention and adaptive learnable modules, recovering near-original performance while outperforming prior linearization methods on MMLU and associative recall.
Should We Still Pretrain Encoders with Masked Language Modeling? cs.CL · 2025-07-01 · accept · none · ref 18 · internal anchor
Controlled ablations of 38 models find MLM superior to CLM on representation benchmarks while CLM offers better data efficiency and stability; a biphasic CLM-then-MLM schedule is optimal under fixed compute and improves when initialized from pretrained CLM models.
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning cs.AI · 2025-07-01 · conditional · none · ref 295 · internal anchor
Math reasoning gains in LLMs rarely transfer to general domains; RL tuning generalizes while SFT causes forgetting and representation drift.
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning cs.CL · 2025-06-23 · unverdicted · none · ref 15 · internal anchor
LongWriter-Zero applies RL from a base model with specialized rewards for length, quality, and structure to outperform SFT baselines and larger models on long-writing benchmarks.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity cs.AI · 2025-06-07 · unverdicted · none · ref 10 · internal anchor
LRMs exhibit complete accuracy collapse beyond certain puzzle complexities, with reasoning effort rising then declining, outperforming standard LLMs only on medium-complexity tasks.
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics cs.LG · 2025-06-02 · unverdicted · none · ref 22 · internal anchor
SmolVLA is a small efficient VLA model that achieves performance comparable to 10x larger models while training on one GPU and deploying on consumer hardware via community data and chunked asynchronous action prediction.
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment cs.LG · 2025-05-30 · unverdicted · none · ref 18 · internal anchor
Disentangled Safety Adapters decouple safety computations from task-optimized LLMs via lightweight adapters, yielding up to 53% better AUC on safety tasks and dynamic inference-time alignment with reduced performance trade-offs.
Real-World Doctor Agent with Proactive Consultation through Multi-Agent Reinforcement Learning cs.CL · 2025-05-26 · unverdicted · none · ref 44 · internal anchor
DoctorAgent-RL trains a Qwen2.5-7B doctor agent via multi-agent RL on the new MTMedDialog dataset to conduct dynamic, question-driven consultations, reaching 70% exact diagnostic match in real-patient trials.
Phonetic Perturbations Reveal Tokenizer-Rooted Safety Gaps in LLMs cs.CL · 2025-05-20 · unverdicted · none · ref 15 · internal anchor
Phonetic perturbations fragment safety-critical tokens in LLMs, suppressing attribution scores while preserving input understanding and causing safety mechanisms to fail despite good comprehension.
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs cs.CR · 2025-04-08 · unverdicted · none · ref 36 · internal anchor
ShadowCoT introduces a reasoning-level backdoor attack on LLMs achieving 94.4% attack success rate and 88.4% hijacking success rate with 0.15% parameter updates via internal state conditioning and reasoning chain pollution.
A Study of LLMs' Preferences for Libraries and Programming Languages cs.SE · 2025-03-21 · unverdicted · none · ref 32 · internal anchor
Empirical study of eight LLMs finds overuse of popular libraries like NumPy in up to 45% of unnecessary cases and strong default preference for Python even when suboptimal.
WiseMind: a knowledge-guided multi-agent framework for accurate and empathetic psychiatric diagnosis cs.AI · 2025-02-28 · unverdicted · none · ref 21 · internal anchor
WiseMind is a dual-agent LLM system with DSM-5 knowledge graph guidance that reaches 85.6% top-1 diagnostic accuracy on simulated and real psychiatric conversations while producing supportive responses.
Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models cs.CL · 2025-02-20 · unverdicted · none · ref 26 · internal anchor
Adapts multi-layer token-level Mahalanobis distance with supervised linear regression to yield improved uncertainty scores for LLM truthfulness tasks.
Estimating Commonsense Plausibility through Semantic Shifts cs.CL · 2025-02-19 · unverdicted · none · ref 3 · internal anchor
ComPaSS estimates commonsense plausibility by quantifying semantic shifts induced by augmenting sentences with related information and outperforms generative baselines on fine-grained tasks for language and vision-language models.
