STARE uses step-wise RL to attack multimodal models, achieving 68% higher attack success rate while revealing that adversarial optimization concentrates conceptual toxicity early and detail toxicity late in the generation trajectory.
ISBN 979-8-89176-189-6
8 Pith papers cite this work.
citing papers explorer
- STARE: Step-wise Temporal Alignment and Red-teaming Engine for Multi-modal Toxicity Attack
  STARE uses step-wise RL to attack multimodal models, achieving 68% higher attack success rate while revealing that adversarial optimization concentrates conceptual toxicity early and detail toxicity late in the generation trajectory.
- Entropy-informed Decoding: Adaptive Information-Driven Branching
  EDEN adaptively sets branching factor proportional to next-token entropy, achieving better accuracy per expansion than fixed beam search while providing a proof that monotone entropy-based branching outperforms any fixed budget allocation.
- EngiAgent: Fully Connected Coordination of LLM Agents for Solving Open-ended Engineering Problems with Feasible Solutions
  EngiAgent deploys a fully connected multi-agent coordinator to achieve higher feasibility rates when using LLMs to solve open-ended engineering problems under physical and data constraints.
- FlowBot: Inducing LLM Workflows with Bilevel Optimization and Textual Gradients
  FlowBot automatically induces LLM workflows through bilevel optimization with textual gradients, achieving competitive performance against human-crafted baselines.
- CREATE: Testing LLMs for Associative Creativity
  CREATE is a benchmark that scores LLMs on their ability to produce many specific and diverse associative paths between concepts drawn from parametric knowledge.
- CL-bench Life: Can Language Models Learn from Real-Life Context?
  CL-bench Life shows frontier language models achieve only 13.8% average success on real-life context tasks, with the best model at 19.3%.
- BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate
  BARRED uses dimension decomposition and asymmetric multi-agent debate to generate high-fidelity synthetic data that lets small fine-tuned models outperform proprietary LLMs and existing guardrail models on custom policies.
- Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
  DOVE constructs a value codebook via rate-distortion variational optimization from 10K documents and measures LLM-human cultural alignment through unbalanced optimal transport, showing 31.56% correlation with downstream tasks and reliability at 500 samples per culture.