CodeMind evaluates ten LLMs on four benchmarks using three new code reasoning tasks, finding performance varies by model size and drops with complexity while showing no correlation with bug repair ability.
Understanding the effectiveness of large language models in code translation,
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
SWaRL trains code LLMs with RL using compiler correctness signals and a confidential verifier reward to embed robust, functionality-preserving watermarks that resist refactoring attacks.
Guided LLM strategies with custom datasets and execution-based verification enable functional APL-to-C# translation across a range of program complexities.
citing papers explorer
-
CodeMind: Evaluating Large Language Models for Code Reasoning
CodeMind evaluates ten LLMs on four benchmarks using three new code reasoning tasks, finding performance varies by model size and drops with complexity while showing no correlation with bug repair ability.
-
SWaRL: Safeguard Code Watermarking via Reinforcement Learning
SWaRL trains code LLMs with RL using compiler correctness signals and a confidential verifier reward to embed robust, functionality-preserving watermarks that resist refactoring attacks.
-
Neural Code Translation of Legacy Code: APL to C#
Guided LLM strategies with custom datasets and execution-based verification enable functional APL-to-C# translation across a range of program complexities.