The Twelfth International Conference on Learning Representations

SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MolViBench: Evaluating LLMs on Molecular Vibe Coding

cs.CL · 2026-05-04 · unverdicted · novelty 7.0

MolViBench is the first benchmark designed to evaluate LLMs on generating executable programs for molecular tasks in drug discovery.

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

cs.CR · 2026-05-07 · unverdicted · novelty 5.0

SafeHarbor introduces a hierarchical memory-augmented guardrail with adversarial rule extraction and entropy-driven self-evolution to balance safety and utility in LLM agents.

citing papers explorer

Showing 2 of 2 citing papers.

MolViBench: Evaluating LLMs on Molecular Vibe Coding · cs.CL · 2026-05-04 · unverdicted · none · ref 16
SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety · cs.CR · 2026-05-07 · unverdicted · none · ref 7
