arXiv preprint arXiv:2402.02047 (2024)

Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Susmit Jha, Prem Devanbu, Toufique Ahmed · 2024 · arXiv 2402.02047

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

When to Answer and When to Defer: A Decision Framework for Reliable Code Predictions

cs.SE · 2026-05-19 · unverdicted · novelty 5.0

Introduces a unified framework integrating uncertainty estimation, calibration, and tool-based abstention for reliable code predictions in language models.

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · novelty 4.0

A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.

Precision or Peril: A PoC of Python Code Quality from Quantized Large Language Models

cs.SE · 2024-11-16 · unverdicted · novelty 3.0

Smaller LLMs produce functional but limited Python code with variable quantization effects and quality/maintainability concerns that require validation before use.

citing papers explorer

Showing 3 of 3 citing papers.

When to Answer and When to Defer: A Decision Framework for Reliable Code Predictions · cs.SE · 2026-05-19 · unverdicted · none · ref 26
Introduces a unified framework integrating uncertainty estimation, calibration, and tool-based abstention for reliable code predictions in language models.
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap · cs.SE · 2024-10-28 · unverdicted · none · ref 96
A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.
Precision or Peril: A PoC of Python Code Quality from Quantized Large Language Models · cs.SE · 2024-11-16 · unverdicted · none · ref 20
Smaller LLMs produce functional but limited Python code with variable quantization effects and quality/maintainability concerns that require validation before use.
