hub Canonical reference

Nguyen and Raymond Choo

Chen, S · 2021 · arXiv 1524.2021

Canonical reference. 79% of citing Pith papers cite this work as background.

32 Pith papers citing it

Background 79% of classified citations

citation-role summary

background 11 · other 2 · method 1

citation-polarity summary

background 11 · unclear 2 · use method 1

representative citing papers

Quantum Mutant Equivalence via Transpilation

cs.SE · 2026-06-25 · unverdicted · novelty 7.0

TBE identifies 32.1% of 92,011 equivalent surviving quantum mutants (29,536) via OpenQASM comparison after transpilation, reporting 100% precision and 82% accuracy on 348,299 mutants.

LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs

cs.AI · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

LGMT is a logic-grounded metamorphic testing framework that detects hidden reasoning defects in LLMs by checking consistency on semantically invariant inputs derived from FOL equivalences.

ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs

cs.SE · 2026-05-01 · unverdicted · novelty 7.0

ClozeMaster masks bracketed structures in historical Rust bug code and uses LLMs to infill them, generating test programs that discovered 27 confirmed bugs in rustc and mrustc while outperforming existing fuzzers.

GraphQLify: Automated and Type Safety-Preserving GraphQL API Adoption

cs.SE · 2026-04-16 · unverdicted · novelty 7.0

GraphQLify automates REST-to-GraphQL migration via static source code analysis, delivering 100% type-safe conversions on 834 APIs and 2-4x faster performance than REST for multi-call workflows.

A Methodological Analysis of Empirical Studies in Quantum Software Testing

quant-ph · 2026-01-13 · accept · novelty 7.0 · 2 refs

A systematic analysis of 59 quantum software testing empirical studies reveals highly diverse designs, inconsistent reporting, and open methodological challenges, leading to recommendations for future work.

EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention

cs.SE · 2025-08-22 · unverdicted · novelty 7.0

EyeMulator augments CodeLLM fine-tuning loss with token weights derived from human eye-tracking scan paths, producing large gains on code translation and summarization across StarCoder, Llama-3.2 and DeepSeek-Coder.

Do Machines Struggle Where Humans Do? LLM and Human Comprehension of Obfuscated Code

cs.SE · 2026-06-30 · unverdicted · novelty 6.0

Reasoning-tuned LLMs align with human comprehension failure patterns under code obfuscation using the Block Model, unlike instruction-tuned variants.

ATTAIN: Automated Exploit Failure Analysis through Trace-Driven Diff Analysis

cs.SE · 2026-06-08 · unverdicted · novelty 6.0

ATTAIN is a three-module trace-driven framework that combines exploit execution, LLM-guided diff search, and evidence-based judgment to identify affected library versions for CVEs, reporting 93.24% F1 on 224 CVEs across 25,943 versions.

PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing

cs.CR · 2026-06-01 · unverdicted · novelty 6.0

PeAR shows static binary instrumentation can instrument 88% of FUZZBENCH targets with 4x throughput gains and coverage matching compiler-based methods.

QUTest: A Native Testing Framework for Quantum Programs

quant-ph · 2026-05-19 · unverdicted · novelty 6.0

QUTest is a native OpenQASM testing framework that encodes Arrange/Act/Assert tests and 12 assertion types via pragma comments while remaining compatible with existing tools.

Robust Mutation Analysis of Quantum Programs Under Noise

cs.SE · 2026-05-13 · conditional · novelty 6.0 · 2 refs

Noise from quantum hardware simulators significantly alters mutant detection distances, making equivalent mutants harder to separate from faults, with output-distribution metrics reaching 73.03% accuracy and 74.89% F1-score under device-specific thresholds.

Code-Centric Detection of Vulnerability-Fixing Commits: A Unified Benchmark and Empirical Study

cs.SE · 2026-05-13 · accept · novelty 6.0

Code language models show no transferable security understanding from code diffs alone, rely on commit messages, miss over 93% of fixes at 0.5% false positive rate, and suffer large drops under group or temporal splits.

Quality-Driven Selective Mutation for Deep Learning

cs.SE · 2026-04-24 · unverdicted · novelty 6.0

A dual-axis quality framework ranks DL mutation operators by statistical resistance and Jaccard-based realism to real faults, enabling up to 55.6% fewer mutants on held-out validation data without dropping baseline performance.

Ethics Testing: Proactive Identification of Generative AI System Harms

cs.SE · 2026-04-23 · unverdicted · novelty 6.0

Ethics testing is introduced as a systematic approach to generate tests that identify software harms induced by unethical behavior in generative AI outputs.

QuanForge: A Mutation Testing Framework for Quantum Neural Networks

cs.SE · 2026-04-22 · unverdicted · novelty 6.0

QuanForge introduces statistical mutation killing and nine post-training mutation operators for QNNs to distinguish test suites and localize vulnerable circuit regions.

Co-Located Tests, Better AI Code: How Test Syntax Structure Affects Foundation Model Code Generation

cs.SE · 2026-04-20 · unverdicted · novelty 6.0

Co-locating tests with implementation code yields substantially higher preservation and correctness in foundation-model-generated programs than separated test syntax.

A Comparative Study of Semantic Log Representations for Software Log-based Anomaly Detection

cs.SE · 2026-04-09 · unverdicted · novelty 6.0

QTyBERT matches or exceeds BERT-based log anomaly detection effectiveness while reducing embedding generation time to near static word embedding levels.

Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance in LLM-based Issue Resolution

cs.SE · 2026-04-07 · unverdicted · novelty 6.0

LLM agents resolve fewer than half of issues while satisfying design constraints despite passing tests, as shown by a benchmark of 495 issues and 1787 constraints from six repositories.

Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study

cs.SE · 2025-11-29 · unverdicted · novelty 6.0

APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.

Multi-LLM Orchestration for High-Quality Code Generation: Exploiting Complementary Model Strengths

cs.SE · 2025-10-01 · conditional · novelty 6.0

PerfOrch is a four-agent multi-LLM system that uses offline profiling to build language-and-category rankings for routing tasks, achieving 97.19% and 95.83% pass@1 on HumanEval-X and EffiBench-X with generalization across benchmarks.

Constrained Co-evolutionary Metamorphic Differential Testing for Autonomous Systems with an Interpretability Approach

cs.SE · 2025-09-20 · unverdicted · novelty 6.0

CoCoMagic applies constrained cooperative co-evolution to metamorphic and differential testing to find up to 287% more distinct behavioral divergences in an end-to-end ADS than baseline search methods.

UntrustVul: An Automated Approach for Identifying Untrustworthy Alerts in Vulnerability Detection Models

cs.SE · 2025-03-19 · unverdicted · novelty 6.0

UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.

MR-Adopt: Automatic Deduction of Input Transformation Function for Metamorphic Testing

cs.SE · 2024-08-28 · unverdicted · novelty 6.0

MR-Adopt deduces input transformations from hard-coded MR test cases using LLMs, data-flow refinement, and output-relation selection to enable reuse with new source inputs.

MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases

cs.SE · 2023-04-15 · unverdicted · novelty 6.0

MR-Scout extracts over 11,000 metamorphic-relation-encoded test cases from 701 OSS projects, codifies 97% of them as high-quality generators, and shows they raise line coverage by 13.52% and mutation score by 9.42% on programs that already have developer tests.

citing papers explorer

Showing 32 of 32 citing papers.

Quantum Mutant Equivalence via Transpilation cs.SE · 2026-06-25 · unverdicted · none · ref 3
TBE identifies 32.1% of 92,011 equivalent surviving quantum mutants (29,536) via OpenQASM comparison after transpilation, reporting 100% precision and 82% accuracy on 348,299 mutants.
LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs cs.AI · 2026-05-12 · unverdicted · none · ref 2 · 2 links
LGMT is a logic-grounded metamorphic testing framework that detects hidden reasoning defects in LLMs by checking consistency on semantically invariant inputs derived from FOL equivalences.
ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs cs.SE · 2026-05-01 · unverdicted · none · ref 8
ClozeMaster masks bracketed structures in historical Rust bug code and uses LLMs to infill them, generating test programs that discovered 27 confirmed bugs in rustc and mrustc while outperforming existing fuzzers.
GraphQLify: Automated and Type Safety-Preserving GraphQL API Adoption cs.SE · 2026-04-16 · unverdicted · none · ref 34
GraphQLify automates REST-to-GraphQL migration via static source code analysis, delivering 100% type-safe conversions on 834 APIs and 2-4x faster performance than REST for multi-call workflows.
A Methodological Analysis of Empirical Studies in Quantum Software Testing quant-ph · 2026-01-13 · accept · none · ref 73 · 2 links
A systematic analysis of 59 quantum software testing empirical studies reveals highly diverse designs, inconsistent reporting, and open methodological challenges, leading to recommendations for future work.
EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention cs.SE · 2025-08-22 · unverdicted · none · ref 40
EyeMulator augments CodeLLM fine-tuning loss with token weights derived from human eye-tracking scan paths, producing large gains on code translation and summarization across StarCoder, Llama-3.2 and DeepSeek-Coder.
Do Machines Struggle Where Humans Do? LLM and Human Comprehension of Obfuscated Code cs.SE · 2026-06-30 · unverdicted · none · ref 47
Reasoning-tuned LLMs align with human comprehension failure patterns under code obfuscation using the Block Model, unlike instruction-tuned variants.
ATTAIN: Automated Exploit Failure Analysis through Trace-Driven Diff Analysis cs.SE · 2026-06-08 · unverdicted · none · ref 70
ATTAIN is a three-module trace-driven framework that combines exploit execution, LLM-guided diff search, and evidence-based judgment to identify affected library versions for CVEs, reporting 93.24% F1 on 224 CVEs across 25,943 versions.
PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing cs.CR · 2026-06-01 · unverdicted · none · ref 9
PeAR shows static binary instrumentation can instrument 88% of FUZZBENCH targets with 4x throughput gains and coverage matching compiler-based methods.
QUTest: A Native Testing Framework for Quantum Programs quant-ph · 2026-05-19 · unverdicted · none · ref 13
QUTest is a native OpenQASM testing framework that encodes Arrange/Act/Assert tests and 12 assertion types via pragma comments while remaining compatible with existing tools.
Robust Mutation Analysis of Quantum Programs Under Noise cs.SE · 2026-05-13 · conditional · none · ref 47 · 2 links
Noise from quantum hardware simulators significantly alters mutant detection distances, making equivalent mutants harder to separate from faults, with output-distribution metrics reaching 73.03% accuracy and 74.89% F1-score under device-specific thresholds.
Code-Centric Detection of Vulnerability-Fixing Commits: A Unified Benchmark and Empirical Study cs.SE · 2026-05-13 · accept · none · ref 69
Code language models show no transferable security understanding from code diffs alone, rely on commit messages, miss over 93% of fixes at 0.5% false positive rate, and suffer large drops under group or temporal splits.
Quality-Driven Selective Mutation for Deep Learning cs.SE · 2026-04-24 · unverdicted · none · ref 38
A dual-axis quality framework ranks DL mutation operators by statistical resistance and Jaccard-based realism to real faults, enabling up to 55.6% fewer mutants on held-out validation data without dropping baseline performance.
Ethics Testing: Proactive Identification of Generative AI System Harms cs.SE · 2026-04-23 · unverdicted · none · ref 19
Ethics testing is introduced as a systematic approach to generate tests that identify software harms induced by unethical behavior in generative AI outputs.
QuanForge: A Mutation Testing Framework for Quantum Neural Networks cs.SE · 2026-04-22 · unverdicted · none · ref 40
QuanForge introduces statistical mutation killing and nine post-training mutation operators for QNNs to distinguish test suites and localize vulnerable circuit regions.
Co-Located Tests, Better AI Code: How Test Syntax Structure Affects Foundation Model Code Generation cs.SE · 2026-04-20 · unverdicted · none · ref 16
Co-locating tests with implementation code yields substantially higher preservation and correctness in foundation-model-generated programs than separated test syntax.
A Comparative Study of Semantic Log Representations for Software Log-based Anomaly Detection cs.SE · 2026-04-09 · unverdicted · none · ref 16
QTyBERT matches or exceeds BERT-based log anomaly detection effectiveness while reducing embedding generation time to near static word embedding levels.
Does Pass Rate Tell the Whole Story? Evaluating Design Constraint Compliance in LLM-based Issue Resolution cs.SE · 2026-04-07 · unverdicted · none · ref 32
LLM agents resolve fewer than half of issues while satisfying design constraints despite passing tests, as shown by a benchmark of 495 issues and 1787 constraints from six repositories.
Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study cs.SE · 2025-11-29 · unverdicted · none · ref 54
APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.
Multi-LLM Orchestration for High-Quality Code Generation: Exploiting Complementary Model Strengths cs.SE · 2025-10-01 · conditional · none · ref 24
PerfOrch is a four-agent multi-LLM system that uses offline profiling to build language-and-category rankings for routing tasks, achieving 97.19% and 95.83% pass@1 on HumanEval-X and EffiBench-X with generalization across benchmarks.
Constrained Co-evolutionary Metamorphic Differential Testing for Autonomous Systems with an Interpretability Approach cs.SE · 2025-09-20 · unverdicted · none · ref 26
CoCoMagic applies constrained cooperative co-evolution to metamorphic and differential testing to find up to 287% more distinct behavioral divergences in an end-to-end ADS than baseline search methods.
UntrustVul: An Automated Approach for Identifying Untrustworthy Alerts in Vulnerability Detection Models cs.SE · 2025-03-19 · unverdicted · none · ref 39
UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.
MR-Adopt: Automatic Deduction of Input Transformation Function for Metamorphic Testing cs.SE · 2024-08-28 · unverdicted · none · ref 5
MR-Adopt deduces input transformations from hard-coded MR test cases using LLMs, data-flow refinement, and output-relation selection to enable reuse with new source inputs.
MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases cs.SE · 2023-04-15 · unverdicted · none · ref 2
MR-Scout extracts over 11,000 metamorphic-relation-encoded test cases from 701 OSS projects, codifies 97% of them as high-quality generators, and shows they raise line coverage by 13.52% and mutation score by 9.42% on programs that already have developer tests.
Context-Based Adversarial Attacks on AI Code Generators: Vulnerability Analysis and Implications cs.CR · 2026-06-09 · unverdicted · none · ref 18
Context-based adversarial attacks raise vulnerable code generation in models like GPT-4 and CodeLlama from 3.5% to 37.4%, with 60-100% transferability, and a dual-layer defense reaches 89.1% detection at low false positives.
Boosting Automatic Java-to-Cangjie Translation with Multi-Stage LLM Training and Error Repair cs.SE · 2026-05-08 · unverdicted · none · ref 3
Multi-stage LLM training plus compiler-guided error repair boosts functional equivalence in Java-to-Cangjie translation by 6.06% over prior methods despite scarce parallel data.
Accelerating Policy Synthesis in Large-Scale MDPs via Hierarchical Adaptive Refinement cs.AI · 2025-06-21 · unverdicted · none · ref 51
Presents hierarchical adaptive refinement to accelerate near-optimal policy synthesis in MDPs up to 1M states with up to 2x speedup over PRISM and formal error bounds.
Context-Aware Unit Testing for Quantum Subroutines quant-ph · 2025-06-12 · unverdicted · none · ref 46
Proposes a context-aware unit testing framework for quantum subroutines modeled as parametrized quantum channels, using probabilistic assertions and demonstrated on GHZ preparation and Shor's algorithm subroutines.
DeepFWI: Identifying Bug-Sensitive Warnings with Multi-Modal Code-Warning Semantics cs.SE · 2024-03-24 · conditional · none · ref 56
DeepFWI is a multi-modal LSTM model with cross-attention that identifies bug-sensitive warnings at warning granularity, reaching 67.06% F1 on a 280k-warning dataset and surfacing 25 confirmed bugs in four open-source projects.
HYDRA: A Hybrid Heuristic-Guided Deep Representation Architecture for Predicting Latent Zero-Day Vulnerabilities in Patched Functions cs.CR · 2025-11-09 · unverdicted · none · ref 20
HYDRA is a hybrid model that uses heuristics plus deep embeddings and a VAE to predict latent zero-day vulnerabilities in patched functions from Chrome, Android, and ImageMagick.
Search-Based Software Engineering and AI Foundation Models: Current Landscape and Future Roadmap cs.SE · 2025-05-26 · unverdicted · none · ref 118
A research roadmap analyzing the current state of search-based software engineering with foundation models, outlining challenges and directions across three integration aspects.
Software Engineering for Self-Adaptive Robotics: A Research Agenda cs.SE · 2025-05-26 · unverdicted · none · ref 133
This paper proposes a research agenda for software engineering of self-adaptive robotic systems along lifecycle stages and enabling technologies, identifying challenges and a roadmap to 2030.
