SWE-bench: Can language models resolve real-world GitHub issues? In The Twelfth International Conference on Learning Representations, 2024

Carlos E Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik R Narasimhan · 2024

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

DBES: A Systematic Benchmark and Metric Suite for Evaluating Expert Specialization in Large-Scale MoEs

cs.LG · 2026-05-18 · unverdicted · novelty 6.0

DBES supplies a multi-domain benchmark and five metrics (Routing Specialization, Normalized Effective Rank, Domain Isolation, Routing Stiffness Score, N-gram Expertise) that reveal distinct specialization patterns across MoE models and enable 66-94% domain gains with 15% of the training resources.

citing papers explorer

DBES: A Systematic Benchmark and Metric Suite for Evaluating Expert Specialization in Large-Scale MoEs cs.LG · 2026-05-18 · unverdicted · none · ref 29
