Mantas Mazeika
- 3 works
- 3 Pith-reviewed
- 100.0% Recognition coverage
- 0 queued
works
- Humanity's Last Exam Pith 2025 · cs.LG · verdict UNVERDICTED · 8 external citations · 100 Pith citing
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal Pith 2024 · cs.LG · verdict UNVERDICTED · 118 Pith citing
- Measuring Massive Multitask Language Understanding Pith 2020 · cs.CY · verdict ACCEPT · 380 Pith citing