pith.

archive

Every paper Pith has read. Search by title, abstract, or pith.

1797 papers in cs.SE · page 13

  1. quant-ph 2026-04-29 reviewed
    Quantum circuits cover conditions well but paths poorly

    Probabilistic Condition, Decision and Path Coverage of Circuit-based Quantum Programs

    Daniel Fortunato +2

  2. cs.AI 2026-04-29 reviewed
    MoE models match human graders on math rubrics where 70B model fails

    Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics

    Jatin Bhusal +3

  3. cs.SE 2026-04-29 reviewed
    Seven recommendations guide LLM adoption in software teams

    Recommendations for Efficient and Responsible LLM Adoption within Industrial Software Development

    Krishna Ronanki +5

  4. cs.SE 2026-04-29 reviewed
    Pipeline builds consistent graphs from C

    Graph Construction and Matching for Imperative Programs using Neural and Structural Methods

    Arshad Beg +2

  6. cs.SE 2026-04-29 reviewed
    Natural language scenarios generate higher-coverage tests than BDD

    PICKLES: a Natural Language Framework for Requirement Specification and Model-Based Testing

    María Belén Rodríguez +1

  7. cs.SE 2026-04-29 reviewed
    Solidity semantic clones detected with 97% recall via code and comments

    Identifying and Characterizing Semantic Clones of Solidity Functions

    Ermanno Francesco Sannini +6

  8. cs.SE 2026-04-29 reviewed
    Knowledge graph drives 3x faster documentation with 85% fewer tokens

    RepoDoc: A Knowledge Graph-Based Framework to Automatic Documentation Generation and Incremental Updates

    Dong Xu +4

  9. cs.SE 2026-04-29 reviewed
    Speculative decoding speeds up SE tasks more for small models

    An Empirical Study of Speculative Decoding on Software Engineering Tasks

    Yijia Li +3

  10. cs.SE 2026-04-29 reviewed
    LLMs vary widely in screening papers for software SLRs

    Beyond Accuracy: LLM Variability in Evidence Screening for Software Engineering SLRs

    Gilberto Sussumu Hida +2

  11. cs.SE 2026-04-29 reviewed
    Swarm optimizer cuts vehicle offload response times

    Towards Intelligent Computation Offloading in Dynamic Vehicular Networks: A Scalable Multilayer Pipeline

    Falk Dettinger +5

  12. cs.SE 2026-04-29 reviewed
    Asset shells keep OCL constraints inside MBSE models

    Asset Administration Shell-Based OCL Validation Framework for Model-Based System Engineering

    Om Parkash +4

  13. cs.SE 2026-04-29 reviewed
    Software engineering shifts from code generation to AI delegation

    Agentic AI in the Software Development Lifecycle: Architecture, Empirical Evidence, and the Reshaping of Software Engineering

    Happy Bhati

  14. cs.CR 2026-04-29 reviewed
    Only 23% of LLM-generated Rust crypto code compiles

    An Empirical Security Evaluation of LLM-Generated Cryptographic Rust Code

    Mohamed Elsayed +2

  15. cs.SE 2026-04-29 reviewed
    Survey finds disconnect between program structure and adaptive security tests

    Adaptive and AI-Augmented Security Testing: A Systematic Survey of Program Analysis, Feedback-Driven Testing, and Hybrid Learning-Based Approaches

    Michael Wienczkowski

  16. cs.SE 2026-04-29 reviewed
    Review shows LLMs automate data tasks in software engineering studies

    LLM-Assisted Empirical Software Engineering: Systematic Literature Review and Research Agenda

    Victoria Gomes +4

  17. cs.SE 2026-04-28 reviewed
    LLM observability layers mature but integration lags

    AI Observability for Large Language Model Systems: A Multi-Layer Analysis of Monitoring Approaches from Confidence Calibration to Infrastructure Tracing

    Twinkll Sisodia

  18. cs.SE 2026-04-28 reviewed
    LLM pipeline lifts bug report completeness from 8% to 96%

    ImproBR: Bug Report Improver Using LLMs

    Emre Furkan Akyol +2

  19. cs.SE 2026-04-28 reviewed
    Multi-view training detects AI code on unseen languages at 0.845 F1

    UCSC-NLP at SemEval-2026 Task 13: Multi-View Generalization and Diagnostic Analysis of Machine-Generated Code Detection

    Kargi Chauhan +1

  20. cs.SE 2026-04-28 reviewed
    LLM turns uncovered code into valid bug reports at 85 percent rate

    LLM-Guided Issue Generation from Uncovered Code Segments

    Diany Pressato +3

  21. cs.SE 2026-04-28 reviewed
    LLM tool turns uncovered code into prioritized bug reports

    LLM-Guided Issue Generation from Uncovered Code Segments

    Diany Pressato +3

  22. cs.SE 2026-04-28 reviewed
    Splitting code viewing from editing raises agent success 2.1% at 17.9% lower cost

    SWE-Edit: Rethinking Code Editing for Efficient SWE-Agent

    Yikai Zhang +11

  23. cs.CR 2026-04-28 reviewed
    GenDetect turns a single observed DeFi attack into reusable detection rules by…

    GenDetect: Generalizing Reactive Detection for Resilience Against Imitative DeFi Attack Cascade

    Bowen Cai +6

  24. cs.SE 2026-04-28 reviewed
    Carbon-tax ordering cuts LLM memory by up to 49x

    Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models

    Ajmain Inqiad Alam +4

  25. cs.SE 2026-04-28 reviewed
    Multi-LLM pipeline extracts 734 trajectories from GitHub issues

    From Threads to Trajectories: A Multi-LLM Pipeline for Community Knowledge Extraction from GitHub Issue Discussions

    Nazia Shehnaz Joynab +1

  26. cs.SE 2026-04-28 reviewed
    LLM REST tests lose effectiveness on faulty code and vague specs

    RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements

    Leon Kogler +5

  27. cs.CL 2026-04-28 reviewed
    Evolved harnesses raise coding-agent pass@1 from 69.7% to 77%

    Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses

    Jiahang Lin +10

  28. cs.CL 2026-04-28 reviewed
    Ten AHE iterations lift coding-agent pass@1 to 77%

    Agentic Harness Engineering: Observability-Driven Automatic Evolution of Coding-Agent Harnesses

    Jiahang Lin +10

  29. cs.SE 2026-04-28 reviewed
    RSEs form a collective identity that shapes their wellbeing

    Does social identity matter in software engineering? Assessing the case of research software engineers

    Chukwudi Uwasomba +7

  30. cs.SE 2026-04-28 reviewed
    Developer roles drive microservices coupling more than architecture

    Key Developer Roles and Organizational Coupling in Microservices: A Longitudinal Analysis

    Xiaozhou Li +3

  31. cs.SE 2026-04-28 reviewed
    Code metrics match plagiarism tools in ranking performance

    Can Code Evaluation Metrics Detect Code Plagiarism?

    Fahad Ebrahim +1

  32. cs.SE 2026-04-28 reviewed
    Scenarios compose into online tests for robot systems

    Scenario-based System Testing for Distributed Robotics Applications

    Jan Peleska +3

  33. cs.SE 2026-04-28 reviewed
    Multi-agent editing lifts code success to 68.6 percent

    SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?

    Noam Tarshish +6

  34. cs.SE 2026-04-28 reviewed
    Code-comment alignment lifts F1 scores by up to 27% in vulnerability detection

    Learning Generalizable Multimodal Representations for Software Vulnerability Detection

    Zeming Dong +7

  35. cs.SE 2026-04-28 reviewed
    Classical ML beats transformers for bug report fault localization

    Bug-Report-Driven Fault Localization: Industrial Benchmarking and Lesson Learned at ABB Robotics

    Pernilla Hall +3

  36. cs.SE 2026-04-28 reviewed
    Bug report text trains models to find faults in robotics code

    Bug-Report-Driven Fault Localization: Industrial Benchmarking and Lesson Learned at ABB Robotics

    Pernilla Hall +3

  37. cs.SE 2026-04-28 reviewed
    GPT tools draft spreadsheet models but fail to reproduce them consistently

    Spreadsheet Modeling Experiments Using GPTs on Small Problem Statements and the Wall Task

    Thomas A. Grossman +2

  38. cs.SE 2026-04-28 reviewed
    LLMs generate Given-When-Then tests for FMU simulations

    Using Large Language Models for Black-Box Testing of FMU-Based Simulations

    Abdullah Mughees +5

  39. cs.SE 2026-04-28 reviewed
    PLM choice outweighs GNN backbone in code hybrid models

    PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection

    Mohamed Taoufik Kaouthar El Idrissi +2

  40. cs.SE 2026-04-28 reviewed
    12,000 tests quantify energy costs of mobile settings

    An Empirical Analysis of Mobile Energy Consumption Across User Configurations

    Wellington Oliveira

  41. cs.SE 2026-04-28 reviewed
    MBSE models must be co-designed as AI-queryable knowledge bases

    AI as Consumer and Participant: A Co-Design Agenda for MBSE Substrates and Methodology

    Siyuan Ji

  42. cs.SE 2026-04-28 reviewed
    MLLMs suggest ranked usability fixes from videos

    Recommending Usability Improvements with Multimodal Large Language Models

    Sebastian Lubos +4

  43. cs.SE 2026-04-28 reviewed
    LLMs inconsistent on equivalent code versions

    CoRE: A Fine-Grained Code Reasoning Benchmark Beyond Output Prediction

    Jun Gao +8

  44. cs.SE 2026-04-28 reviewed
    Commit structure lifts test prioritization in CI

    Commit-Aware Learning-Based Test Case Prioritization for Continuous Integration

    Lorenzo Abbondante +1

  45. cs.SE 2026-04-28 reviewed
    R³-SQL reaches 75.03 accuracy on BIRD-dev for Text-to-SQL

    R³-SQL: Ranking Reward and Resampling for Text-to-SQL

    Hojae Han +4

  46. cs.DB 2026-04-28 reviewed
    VisualNeo connects visual queries to Neo4j for graph searches

    VisualNeo: Bridging the Gap between Visual Query Interfaces and Graph Query Engines

    Kai Huang +7

  47. cs.CR 2026-04-28 reviewed
    MARD is a multi-agent system that uses large language models to detect Android malware by…

    MARD: A Multi-Agent Framework for Robust Android Malware Detection

    Xueying Zeng +6

  48. cs.LG 2026-04-28 reviewed
    DiRe preserves 3-4 times more topology than UMAP at equal speed

    DiRe-RAPIDS: Topology-faithful dimensionality reduction at scale

    Alexander Kolpakov +1

  49. cs.CR 2026-04-28 reviewed
    Conformance checking runs on homomorphically encrypted logs

    Secure Conformance Checking using Token-based Replay and Homomorphic Encryption

    Luis-Armando Rodríguez-Flores +3

  50. cs.CR 2026-04-28 reviewed
    Four agents turn incomplete Rust CVEs into analyzable tests

    Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets

    Zeyad Abdelrazek +1