Helm, and Carey E

URL https://arxiv · 2023 · arXiv 2305.05126

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Black-box model classification under the discriminative factorization

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Discriminative factorization distinguishes high-quality query sets for black-box model classification, with chance-level error decaying exponentially in query budget and parameters predicting empirical decay rates on auditing tasks.

Query-efficient model evaluation using cached responses

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

DKPS-based methods predict new model benchmark scores using cached responses, matching baseline mean absolute error with substantially fewer queries and an offline query selection approach.

Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models

cs.CR · 2026-05-26 · unverdicted · novelty 5.0

Behavioral geometry of model populations enables high-accuracy jailbreak susceptibility prediction and defense transfer with 98% fewer evaluations.

Control Charts for Multi-agent Systems

cs.MA · 2026-05-11 · unverdicted · novelty 5.0

Adaptive control charts can monitor learning multi-agent systems but are vulnerable to gradual adversarial defection, revealing a fundamental tradeoff between allowing agents to learn and maintaining security against adversaries.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Black-box model classification under the discriminative factorization cs.LG · 2026-05-08 · unverdicted · none · ref 10
Discriminative factorization distinguishes high-quality query sets for black-box model classification, with chance-level error decaying exponentially in query budget and parameters predicting empirical decay rates on auditing tasks.
Query-efficient model evaluation using cached responses cs.LG · 2026-05-08 · unverdicted · none · ref 8
DKPS-based methods predict new model benchmark scores using cached responses, matching baseline mean absolute error with substantially fewer queries and an offline query selection approach.
Jailbreak susceptibility prediction and mitigation via the behavioral geometry of models cs.CR · 2026-05-26 · unverdicted · none · ref 10
Behavioral geometry of model populations enables high-accuracy jailbreak susceptibility prediction and defense transfer with 98% fewer evaluations.
Control Charts for Multi-agent Systems cs.MA · 2026-05-11 · unverdicted · none · ref 9
Adaptive control charts can monitor learning multi-agent systems but are vulnerable to gradual adversarial defection, revealing a fundamental tradeoff between allowing agents to learn and maintaining security against adversaries.

Helm, and Carey E

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer