5 Pith papers cite this work.
Years: 2026 (5)
citing papers explorer
- Exploring Knowledge Conflicts for Faithful LLM Reasoning: Benchmark and Method
  ConflictQA benchmark shows LLMs fail to resolve conflicts between text and KG evidence and often default to one source, motivating the XoT explanation-based reasoning method.
- PaintCopilot: Modeling Painting as Autonomous Artistic Continuation
  PaintCopilot models painting as an open-ended autoregressive process that predicts coherent brushstrokes from partial canvas observations using a ViT target predictor, flow-matching stroke generator, and VAE region sampler.
- TeleCom-Bench: How Far Are Large Language Models from Industrial Telecommunication Applications?
  TeleCom-Bench reveals LLMs reach 90% on telecom intent and entity tasks but drop to 30% on solution generation and root cause analysis in live network scenarios.
- CoX-MoE: Coalesced Expert Execution for High-Throughput MoE Inference with AMX-Enabled CPU-GPU Co-Execution
  CoX-MoE achieves up to 7.1x higher throughput than FlexGen for MoE inference via coalesced expert execution and AMX-enabled CPU-GPU orchestration with static expert stratification.
- Large Language Models for Web Accessibility: A Systematic Literature Review
  A review of 38 studies finds LLMs mostly target text-based accessibility tasks under WCAG guidelines, with limited attention to cognitive issues and rare direct involvement of disabled users in evaluations.