A Creator-Inspector multi-agent LLM pipeline for constitutive artificial neural networks increases the rate of models satisfying all nine physical constraints to 100% or 56% depending on the LLM backbone.
arXiv preprint arXiv:2510.11661 , year=
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
A knowledge-first approach to LLM-driven automatic heuristic design in combinatorial optimization yields better discovery efficiency, transfer, and generalization than code-centric baselines by formalizing a distortion-compression trade-off.
LLMs predict outcomes of real scientific experiments at 14-26% accuracy, comparable to human experts, but lack calibration on prediction reliability while humans demonstrate strong calibration.
LLM-ODE integrates large language models into genetic programming to guide symbolic search for governing equations of dynamical systems, outperforming classical GP on 91 test cases in efficiency and solution quality.
Decentralized AI agent teams self-organize around hypotheses, critique proposals, and share knowledge to outperform single-agent baselines on biomedical ML, language-model optimization, and protein fitness tasks.
Clarification-seeking in LLM agents amplifies prompt injection attack success from ~2% to over 30% across ten frontier models in a new 728-scenario benchmark.
Glia deploys a multi-agent LLM workflow with reasoning, experimentation, and analysis agents to generate interpretable algorithms for request routing, scheduling, and auto-scaling in distributed GPU clusters, reaching human-expert performance levels.
citing papers explorer
-
ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents
Clarification-seeking in LLM agents amplifies prompt injection attack success from ~2% to over 30% across ten frontier models in a new 728-scenario benchmark.