pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1797 papers in cs.SE · page 18

  1. cs.SE 2026-04-20 reviewed
    Two-agent system repairs LLM agent bugs more effectively

    SelfHeal: Empirical Fix Pattern Analysis and Bug Repair in LLM Agents

    Niful Islam +2

  2. cs.SE 2026-04-19 reviewed
    Three patterns mark how teams respond to GitHub Actions failures

    Beyond the YAML File: Understanding Real-World GitHub Actions Workflow Adoption

    Ali Khatami +2

  3. cs.AI 2026-04-19 reviewed
    Hugging Face data drives dynamic AI model card updates

    Toward Reusability of AI Models Using Dynamic Updates of AI Documentation

    Peter Bajcsy +1

  4. cs.SE 2026-04-19 reviewed
    AI code shows 1.8 times more quiet-failure risks than human code

    AIRA: AI-Induced Risk Audit: A Structured Inspection Framework for AI-Generated Code

    William M. Parris

  5. cs.SE 2026-04-19 reviewed
    Logging tools need multilingual checks to be reliable

    Single-Language Evidence Is Insufficient for Automated Logging: A Multilingual Benchmark and Empirical Study with LLMs

    Renyi Zhong +5

  6. cs.SE 2026-04-19 reviewed
    QRisk cuts quantum noise 45% by avoiding recurring error patterns

    Isolating Recurring Execution-Dependent Abnormal Patterns on NISQ Quantum Devices

    Zhenyu Qi +4

  7. cs.SE 2026-04-19 reviewed
    Analysis extracts unit tests from integration tests

    Augmenting unit test suites from integration tests

    Katerina Paltoglou +1

  8. cs.SE 2026-04-19 reviewed
    Technology research software forms its own overlooked category

    Technology Research Software: An Often Overlooked Category of Research Software

    Wilhelm Hasselbring +2

  9. cs.SE 2026-04-19 reviewed
    Reverse-engineered specs yield 94% APR success on Defects4J

    Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications

    Yongchao Wang +1

  10. cs.CY 2026-04-19 reviewed
    Adaptive AI personas teach coding tool use

    Agentic Education: Using Claude Code to Teach Claude Code

    Zain Naboulsi

  11. cs.SE 2026-04-19 reviewed
    Modeling projects as networks provides more consistent estimates of resilience to key…

    Project resilience as network robustness

    Sebastiano A. Piccolo +1

  12. cs.SE 2026-04-19 reviewed
    ML automation targets RISC-V certification costs for cars

    RISC-V Functional Safety for Autonomous Automotive Systems: An Analytical Framework and Research Roadmap for ML-Assisted Certification

    Nick Andreasyan +4

  13. cs.SE 2026-04-19 reviewed
    Models pass tests by regenerating code

    Precise Debugging Benchmark: Is Your Model Debugging or Regenerating?

    Wang Bill Zhu +7

  14. cs.SE 2026-04-19 reviewed
    LLMs pass 76% of tests but edit with under 45% precision

    Precise Debugging Benchmark: Is Your Model Debugging or Regenerating?

    Wang Bill Zhu +7

  15. cs.SE 2026-04-19 reviewed
    LLMs detect design patterns with promising accuracy

    A Pilot Study on Detecting Software Design Patterns with Large Language Models: An Empirical Evaluation

    Oishik Chowdhury +2

  16. cs.SE 2026-04-19 reviewed
    KnowPilot improves domain text generation by merging priors

    KnowPilot: Your Knowledge-Driven Copilot for Domain Tasks

    Zekun Xi +7

  17. cs.SE 2026-04-19 reviewed
    T2MRec matches tasks to MCP servers via semantic and structural cues

    From Language to Action: Enhancing LLM Task Efficiency with Task-Aware MCP Server Recommendation

    Shiyu He +5

  18. cs.SE 2026-04-19 reviewed
    Kimi-K2.5 at 3 bits tops models on React Native app task

    React-ing to Grace Hopper 200: Five Open-Weights Coding Models, One React Native App, One GH200, One Weekend

    Alex Potanin

  19. cs.SE 2026-04-19 reviewed
    Personas in requirements engineering align clinical AI trainers with real practice

    Persona-Based Requirements Engineering for Explainable Multi-Agent Educational Systems: A Scenario Simulator for Clinical Reasoning Training

    Weibing Zheng +5

  20. cs.SE 2026-04-19 reviewed
    Adaptive router lifts LLM code repair accuracy by 32 percent

    SynthFix: Adaptive Neuro-Symbolic Code Vulnerability Repair

    Yifan Zhang +4

  21. cs.SE 2026-04-19 reviewed
    MoE routing overlaps 11x random even for different code tokens

    Layer-wise MoE Routing Locality under Shared-Prefix Code Generation: Token-Identity Decomposition and Compile-Equivalent Fork Redundancy

    Shun-ichiro Hayashi +3

  22. cs.SE 2026-04-18 reviewed
    Agentic AI governance misses links from rules to provable actions

    Beyond Task Success: An Evidence-Synthesis Framework for Evaluating, Governing, and Orchestrating Agentic AI

    Christopher Koch +1

  23. cs.SE 2026-04-18 reviewed
    Real token tracking matches AI dev costs within 2%

    AI Observability for Developer Productivity Tools: Bridging Cost Awareness and Code Quality

    Happy Bhati +1

  24. cs.SE 2026-04-18 reviewed
    Local command center unifies dev tools and raises AI readiness

    Workstream: A Local-First Developer Command Center for the AI-Augmented Engineering Workflow

    Happy Bhati

  25. cs.SE 2026-04-18 reviewed
    Transfer from C++ improves Ruby and Rust repair Pass@1 by 17 points

    HELO-APR: Enhancing Low-Resource Program Repair through Cross-Lingual Knowledge Transfer

    Zhipeng Wang +7

  26. cs.SE 2026-04-18 reviewed
    Memory cascade resolves 86% of Python dependency issues

    MEMRES: A Memory-Augmented Resolver with Confidence Cascade for Agentic Python Dependency Resolution

    Dao Sy Duy Minh +5

  27. cs.SE 2026-04-18 reviewed
    Co-versioning run-time behavior with code reveals hidden changes

    Treating Run-time Execution History as a First-Class Citizen: Co-Versioning Run-time Behavior alongside Code

    Marcus Kessel

  28. cs.SE 2026-04-18 reviewed
    Gleaner sampler raises RCA accuracy above full dataset at 1 percent rate

    Gleaner: A Semantically-Rich and Efficient Online Sampler for Microservice Diagnostics

    Yifan Yang +4

  29. cs.SE 2026-04-18 reviewed
    Prompt tweaks flip LLM judge verdicts on identical code

    Bias in the Loop: Auditing LLM-as-a-Judge for Software Engineering

    Zixiao Zhao +2

  30. cs.SE 2026-04-18 reviewed
    App reviews flag persistent ethical barriers in mobile apps

    Exploring Ethical Concerns of Mobile Applications from App Reviews: A Literature Survey

    Aakash Sorathiya +1

  31. cs.SE 2026-04-18 reviewed
    Prompt method halves AI bias sensitivity in software tasks

    Mitigating Prompt-Induced Cognitive Biases in General-Purpose AI for Software Engineering

    Francesco Sovrano +2

  32. cs.SE 2026-04-17 reviewed
    AI slop creates a tragedy of the commons in software

    AI Slop and the Software Commons

    Sebastian Baltes +2

  33. cs.AI 2026-04-17 reviewed
    This paper empirically tests 22 agentic AI frameworks on three reasoning benchmarks and…

    Agentic Frameworks for Reasoning Tasks: An Empirical Study

    Zeeshan Rasheed +5

  34. cs.HC 2026-04-17 reviewed
    Conversational agents help high school students with CSP

    Investigating Conversational Agents to Support Secondary School Students Learning CSP

    Matthew Frazier +2

  35. cs.SE 2026-04-17 reviewed
    Survey of 280 researchers diagnoses barriers to cumulative knowledge in software

    From Papers to Progress: Rethinking Knowledge Accumulation in Software Engineering

    Jason Cusati +1

  36. cs.SE 2026-04-17 reviewed
    Fixing requirement mismatches raises LLM code success

    Bridging the Gap between User Intent and LLM: A Requirement Alignment Approach for Code Generation

    Jia Li +9

  37. cs.SE 2026-04-17 reviewed
    Multi-modal verifier raises certified synthesis success rate

    Certified Program Synthesis with a Multi-Modal Verifier

    Yueyang Feng +7

  38. cs.SE 2026-04-17 reviewed
    Contrastive training lifts LLM code detection accuracy to 78 percent

    LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning

    Mahir Labib Dihan +1

  39. cs.SE 2026-04-17 reviewed
  40. cs.AR 2026-04-17 reviewed
    MLIR unifies equivalence checking from algorithms to netlists

    EquivFusion: Unifying Hardware Equivalence Checking from Algorithms to Netlists via MLIR

    Jiaying Zhu +6

  41. cs.SE 2026-04-17 reviewed
    The paper introduces flowR, a VS Code and Positron extension that builds dataflow graphs…

    Supporting the Comprehension of Data Analysis Scripts

    Florian Sihler +4

  42. cs.SE 2026-04-17 reviewed
    Small programs can have up to 76 configuration options

    Small Yet Configurable: Unveiling Null Variability in Software

    Xhevahire Tërnava +3

  43. cs.SE 2026-04-17 reviewed
    Removals lag additions so toggle counts keep rising in large systems

    Feature Toggle Dynamics in Large-Scale Systems: Prevalence, Growth, Lifespan, and Benchmarking

    Xhevahire Tërnava

  44. cs.SE 2026-04-17 reviewed
    QMutBench gives 700k quantum mutants to benchmark tests

    QMutBench: A Dataset of Quantum Circuit Mutants

    Eñaut Mendiluze Usandizaga +3

  45. cs.SE 2026-04-17 reviewed
    Tool pairs LLMs with symbolic checks to create Python contracts

    SpecPylot: Python Specification Generation using Large Language Models

    Ragib Shahariar Ayon +1

  46. cs.SE 2026-04-17 reviewed
    LLM evolves coding skill by generating its own failure tests

    ACE: Self-Evolving LLM Coding Framework via Adversarial Unit Test Generation and Preference Optimization

    Yixu Huang +2

  47. cs.SE 2026-04-17 reviewed
    One LLM improves code by making its own adversarial tests

    ACE: Self-Evolving LLM Coding Framework via Adversarial Unit Test Generation and Preference Optimization

    Yixu Huang +2

  48. cs.SE 2026-04-17 reviewed
    Model unites text, code and images in one retrieval system

    CodeMMR: Bridging Natural Language, Code, and Image for Unified Retrieval

    Jiahui Geng +3

  49. quant-ph 2026-04-17 reviewed
    The paper models quantum error budget allocation as a potential game among logical…

    A Game Theoretic Approach for Optimizing Quantum Error Budget Distribution

    Asif Akhtab Ronggon +1

  50. cs.SE 2026-04-16 reviewed
    Symbolic guardrails enforce 74% of agent safety policies

    Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility

    Yining Hong +4