Toward Realistic AI-Generated Student Questions to Support Instructor Training
8 Pith papers cite this work.
Years: 2026 (8)
citing papers explorer
- SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair
  SmellBench is the first benchmark showing LLM agents resolve 47.7% of architectural code smells while accurately spotting false positives, but aggressive repairs often introduce new smells and degrade overall quality.
- HARMES: A Multi-Modal Dataset for Wearable Human Activity Recognition with Motion, Environmental Sensing and Sound
  HARMES is the first large-scale dataset to combine wrist IMU, environmental, and audio sensors for recognizing 15 household activities across over 80 hours of data from 20 participants.
- AbdomenGen: Sequential Volume-Conditioned Diffusion Framework for Abdominal Anatomy Generation
  A sequential diffusion framework generates controllable abdominal anatomies with a Volume Control Scalar that decouples organ size from body habitus, achieving Dice scores around 0.83 and reducing distributional mismatch by 73.6% in a hepatomegaly example.
- Observability for Post-Quantum TLS Readiness: A Multi-Surface Evidence Framework
  A new multi-surface evidence framework for post-quantum TLS observability that combines passive, active, certificate, and registry data to assess endpoint capabilities across TLS 1.2/1.3 scenarios and outperforms prior analyzers in controlled tests and public campaigns.
- OpenCLAW-Nexus: A Self-Reinforcing Trust Framework for Byzantine-Resilient Decentralized Federated Learning
  OpenCLAW-Nexus uses a single discounted Beta-reputation model to unify reputation-based node selection, Rep-FedAvg aggregation, and reputation-aware BFT consensus, achieving Byzantine resilience in decentralized FL with 72.6% accuracy on non-IID CIFAR-10 under 20% attacks.
- Beyond Silicon: Materials, Mechanisms, and Methods for Physical Neural Computing
  Physical neural computing platforms using diverse materials offer complementary strengths for efficient on-device AI, with no single substrate excelling in all dimensions.
- A Survey of Text and Speech Resources for Hausa and Fongbe: Availability, Quality, and Gaps for NLP Development
  A survey catalogs text and speech resources for Hausa and Fongbe, documenting sizes, domains, licensing, and gaps including limited Fongbe text diversity and missing Hausa speech corpora.
- Is it Cake or is it AI? A Systematic Review of Human Uncertainty in Distinguishing Generative Artificial Intelligence Content
  Humans perform at chance levels when distinguishing generative AI content from human content in text, images, and voice.