LASH adaptively composes multiple jailbreak seed prompts via genetic search over subsets and mixture weights to reach 84.5% keyword ASR and 74.5% two-stage ASR on JailbreakBench while using only 30 queries per prompt.
hub
Qwen2.5: A party of foundation models, September 2024
10 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
TiCo enables spoken dialogue models to follow explicit time constraints in generated responses using Spoken Time Markers and reinforcement learning with verifiable rewards, cutting duration error by 2.7x over its backbone.
Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.
ROMA improves MLLM robustness to seen and unseen visual corruptions by +2.3-2.4% over GRPO on seven reasoning benchmarks while matching clean accuracy.
Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks.
ML-Bench is a multilingual safety benchmark derived from actual regional laws and regulations, paired with ML-Guard guardrail models that outperform 11 baselines on existing and new benchmarks.
ASPIRE learns adaptive graph filters via bi-level optimization to overcome low-frequency explosion bias in spectral collaborative filtering, achieving strong performance and stability.
Entropy minimization on self-generated outputs elicits strong reasoning in pretrained LLMs, matching or exceeding supervised RL methods on benchmarks.
SFT on LLMs removes noise-like token interactions in a brief early phase before introducing overfitted ones, explaining inconsistent effectiveness across model scales.
citing papers explorer
-
Training Transformers for KV Cache Compressibility
Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks.