Sparse autoencoders inserted into VLMs and trained only for reconstruction can reliably detect adversarial attacks on images, including unseen domains and attack types.
Title resolution pending
19 Pith papers cite this work. Polarity classification is still indexing.
years
2026 19verdicts
UNVERDICTED 19representative citing papers
CCTVBench exposes a large gap between standard QA accuracy and contrastive consistency in traffic video reasoning for multimodal LLMs and introduces C-TCD to narrow that gap.
Retrieving structured thinking traces as a corpus improves reasoning performance on AIME, LiveCodeBench, and GPQA over standard RAG or no retrieval.
VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.
GoLongRL releases a 23K-sample open long-context RL dataset spanning 9 tasks and introduces TMN-Reweight to improve multitask optimization, achieving performance comparable to much larger models under GRPO.
DMN achieves over 90% attack success rate on GPT-4o, Gemini-2.5-pro and Claude Sonnet 4 by distributing instructions, supplying multimodal evidence, and adding number chain tasks across multiple images.
PQR framework generates diverse realistic queries to elicit QA agent failures, uncovering 23-78% more unhelpful responses than prior methods in e-commerce agent tests.
GAGPO computes step-aligned temporal advantages from grouped rollout samples without a learned critic, enabling stable policy optimization in multi-turn agent environments.
ICRL uses joint RL training of solver and critic with distribution-calibration re-weighting and role-wise advantage estimation to internalize critique into unassisted LLM performance, yielding 6.4-point gains on agentic tasks and 7.0 on math reasoning with Qwen3 models.
Language models show unstable principal hierarchies and frequently omit known professional standards when user or authority instructions conflict during task execution in medical and legal domains.
Cross-entropy method sampling reduces inferences needed to estimate five-nines LLM reliability by up to 156x on parameterized GSM8K templates, revealing reliability differences hidden by saturated accuracy scores.
Separate modality-specific reasoning before fusion reduces hallucinations and improves accuracy in audio-visual LLMs by enforcing isolated traces then integrating evidence.
Imagining in 360° decouples visual search into a single-step probabilistic semantic layout predictor and an actor, removing the need for multi-turn CoT reasoning and trajectory annotations while improving efficiency in 360° environments.
MedScribe reformulates CT radiology reporting as an agentic evidence-acquisition workflow using LLM-invoked diagnostic tools and pathology-aligned retrieval, yielding higher clinical accuracy and consistency than standard VLMs on CT-RATE and RadChestCT.
Verbal-R3 uses a verbal reranker to generate analytic narratives that guide retrieval and reasoning in LLMs, achieving SOTA results on complex QA benchmarks.
High OCR accuracy on standard metrics does not guarantee strong downstream RAG performance because structural and semantic errors cause retrieval and generation failures on challenging industrial documents.
A parameter-free decomposition in MoE models separates routing control from content, showing that expert trajectories cluster tokens by semantic function across languages and forms, making paths rather than experts the natural unit of interpretability.
PlanRAG-Audio introduces planning-based retrieval-augmented generation to improve accuracy and stability of long-form audio understanding in LALMs by decoupling model input from raw audio duration.
Qwen-Scope provides open-source sparse autoencoders for Qwen models that function as practical interfaces for steering, evaluating, data workflows, and optimizing large language models.
citing papers explorer
-
PlanRAG-Audio: Planning and Retrieval Augmented Generation for Long-form Audio Understanding
PlanRAG-Audio introduces planning-based retrieval-augmented generation to improve accuracy and stability of long-form audio understanding in LALMs by decoupling model input from raw audio duration.