A tool for in-depth analysis of code execution reasoning of large language models

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

fields: cs.SE 3, cs.LG 1

verdicts: UNVERDICTED 4

roles: background 1

polarities: background 1

representative citing papers

CodeMind: Evaluating Large Language Models for Code Reasoning

cs.SE · 2024-02-15 · unverdicted · novelty 7.0

CodeMind evaluates ten LLMs on four benchmarks using three new code reasoning tasks, finding that performance varies with model size, drops as complexity increases, and shows no correlation with bug repair ability.

PrismaDV: Automated Task-Aware Data Unit Test Generation

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

PrismaDV generates task-aware data unit tests by jointly analyzing downstream code and dataset profiles, outperforming task-agnostic baselines on new benchmarks spanning 60 tasks. SIFTA enables automatic prompt optimization that beats hand-written prompts.
