Title resolution pending

SAFEx: Analyzing vulnerabilities of MoE-based LLMs via stable safety-critical expert identification · 2025 · arXiv 2506.17368

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

cs.CR · 2026-05-06 · unverdicted · novelty 7.0

Misrouter enables input-only attacks on MoE LLMs by optimizing queries on open-source surrogates to route toward weakly aligned experts and transferring them to public APIs.

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

RouteHijack is a routing-aware jailbreak that identifies safety-critical experts via activation contrast and optimizes suffixes to suppress them, reaching 69.3% average attack success rate on seven MoE LLMs with strong transfer to variants and VLMs.

Routing Sensitivity Without Controllability: A Diagnostic Study of Fairness in MoE Language Models

cs.CL · 2026-03-28 · unverdicted · novelty 7.0

Routing sensitivity in MoE models is necessary but insufficient for stereotype control because bias and knowledge remain entangled within expert groups and preference shifts do not transfer to generated text.

citing papers explorer

Showing 3 of 3 citing papers.

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

cs.CR · 2026-05-06 · unverdicted · none · ref 20

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs

cs.LG · 2026-05-01 · unverdicted · none · ref 26

Routing Sensitivity Without Controllability: A Diagnostic Study of Fairness in MoE Language Models

cs.CL · 2026-03-28 · unverdicted · none · ref 6
