Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

2026 · cs.CR · arXiv 2604.15022

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, the routing strategy introduces a new security concern: adversaries may manipulate the router to consistently select expensive high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers to expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that R$^2$A significantly increases the routing rate to expensive models on queries of different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.
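The two ideas named in the abstract, an ensemble surrogate that stands in for an unseen router and a suffix search run against that surrogate, can be illustrated with a toy sketch. This is not the authors' implementation: the two scorers, the `HARD_WORDS` set, the vocabulary, and the greedy coordinate search are all hypothetical stand-ins chosen so the example runs without any model, and the abstract does not specify what R$^2$A's surrogate or optimizer actually look like.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A scorer maps a query to a toy estimate of P(router picks the expensive model).
Scorer = Callable[[str], float]

# Hypothetical vocabulary that the toy keyword scorer treats as "hard-looking".
HARD_WORDS = {"prove", "derive", "optimize", "theorem", "complexity"}


def length_scorer(query: str) -> float:
    """Toy surrogate 1 (assumption): longer queries look harder."""
    return min(len(query.split()) / 40.0, 1.0)


def keyword_scorer(query: str) -> float:
    """Toy surrogate 2 (assumption): hard-looking words raise the score."""
    hits = sum(word in HARD_WORDS for word in query.lower().split())
    return min(hits / 3.0, 1.0)


def ensemble_score(query: str, scorers: Sequence[Scorer]) -> float:
    """Average the surrogate scores; a stand-in for an ensemble surrogate router."""
    return sum(scorer(query) for scorer in scorers) / len(scorers)


def greedy_suffix_search(
    query: str,
    scorers: Sequence[Scorer],
    vocab: List[str],
    suffix_len: int = 5,
    seed: int = 0,
) -> Tuple[str, float]:
    """Greedy coordinate search: for each suffix position, keep the token
    that most increases the ensemble score of query + suffix."""
    rng = random.Random(seed)
    suffix = [rng.choice(vocab) for _ in range(suffix_len)]
    best = ensemble_score(query + " " + " ".join(suffix), scorers)
    for pos in range(suffix_len):
        for token in vocab:
            candidate = suffix[:pos] + [token] + suffix[pos + 1:]
            score = ensemble_score(query + " " + " ".join(candidate), scorers)
            if score > best:
                suffix, best = candidate, score
    return " ".join(suffix), best


if __name__ == "__main__":
    scorers = [length_scorer, keyword_scorer]
    vocab = ["please", "prove", "the", "theorem", "hello"]
    query = "what is 2+2"
    suffix, score = greedy_suffix_search(query, scorers, vocab)
    print("baseline:", ensemble_score(query, scorers))
    print("suffix:", suffix, "score:", score)
```

The sketch only shows the shape of the loop: the search never queries the real router, so any effect on a deployed system depends on how well the surrogate ensemble tracks it, which is the transfer question the abstract says the paper evaluates.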

representative citing papers

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

cs.CR · 2026-05-06 · unverdicted · novelty 7.0

Misrouter enables input-only attacks on MoE LLMs by optimizing queries on open-source surrogates to route toward weakly aligned experts and transferring them to public APIs.

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

cs.CR · 2026-05-17 · unverdicted · novelty 6.0

LLM cascade systems are vulnerable to a new adversarial attack that simultaneously degrades accuracy and destroys the intended cost savings by targeting both the lightweight models and the escalation decision mechanism.

HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools

cs.CL · 2026-05-16 · unverdicted · novelty 6.0

HyDRA routes queries to cost-effective LLMs by predicting multi-dimensional capability requirements with a multi-head encoder and applying shortfall matching against configuration-defined model profiles, delivering up to 72.5 percent cost savings on coding benchmarks while remaining decoupled from specific ...

citing papers explorer

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

cs.CR · 2026-05-06 · unverdicted · none · ref 36 · internal anchor

Misrouter enables input-only attacks on MoE LLMs by optimizing queries on open-source surrogates to route toward weakly aligned experts and transferring them to public APIs.

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

cs.CR · 2026-05-17 · unverdicted · none · ref 64 · internal anchor

LLM cascade systems are vulnerable to a new adversarial attack that simultaneously degrades accuracy and destroys the intended cost savings by targeting both the lightweight models and the escalation decision mechanism.

HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools

cs.CL · 2026-05-16 · unverdicted · none · ref 15 · internal anchor

HyDRA routes queries to cost-effective LLMs by predicting multi-dimensional capability requirements with a multi-head encoder and applying shortfall matching against configuration-defined model profiles, delivering up to 72.5 percent cost savings on coding benchmarks while remaining decoupled from specific ...
