A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

Dmitry Ignatov; Radu Timofte; Waleed Khalid

arxiv: 2512.04329 · v2 · pith:VLEBBCFBnew · submitted 2025-12-03 · 💻 cs.CV · cs.SE

Pith reviewed 2026-05-21 17:46 UTC · model grok-4.3

classification 💻 cs.CV cs.SE

keywords: retrieval-augmented generation · neural network extraction · PyTorch modules · reusable architectures · cross-repository migration · LEMUR dataset · code validation · vision models

The pith

NN-RAG extracts and validates 941 executable neural modules from 19 PyTorch repositories, supplying 72 percent of novel architectures to the LEMUR dataset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NN-RAG as a retrieval-augmented generation system that converts large, varied PyTorch codebases into a library of searchable and runnable neural network modules. It does so by resolving dependencies within each module's scope, preserving necessary imports during reconstruction, and applying validators to confirm that blocks are self-contained and functional. A sympathetic reader would care because this process promises to reduce repeated coding effort in vision research by making proven components portable across projects. When run on 19 major repositories the pipeline produced 1,289 candidate blocks, validated 941 of them at a 73 percent success rate, and found that over 80 percent were structurally unique, with NN-RAG accounting for roughly 72 percent of the new network structures added to the LEMUR dataset.

Core claim

NN-RAG performs scope-aware dependency resolution, import-preserving reconstruction, and validator-gated promotion to ensure every retrieved block is scope-closed, compilable, and runnable. Applied to 19 major repositories, the pipeline extracted 1,289 candidate blocks, validated 941 (73.0 percent), and showed that over 80 percent are structurally unique. Through multi-level de-duplication the method contributes the overwhelming majority of unique architectures to the LEMUR dataset, supplying approximately 72 percent of all novel network structures while also enabling automatic cross-repository migration of architectural patterns.

What carries the argument

NN-RAG's three-step pipeline of scope-aware dependency resolution, import-preserving reconstruction, and validator-gated promotion that turns raw repository code into complete, executable neural modules.
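The first two steps can be pictured with a small, standard-library-only sketch. It is not the NN-RAG implementation: the helper names (`names_read`, `top_level_bindings`, `extract_block`) and the toy `SOURCE` module are hypothetical, and the name analysis is a deliberate over-approximation that only follows top-level bindings. Starting from one target class, it walks the names the class reads, pulls in the imports, constants, and helpers that bind them, and re-emits only those statements in their original order.

```python
import ast


def names_read(node: ast.AST) -> set[str]:
    """Every name loaded anywhere inside `node` (coarse over-approximation)."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}


def top_level_bindings(tree: ast.Module) -> dict[str, ast.stmt]:
    """Map each module-level name to the statement that binds it."""
    bindings: dict[str, ast.stmt] = {}
    for stmt in tree.body:
        if isinstance(stmt, (ast.Import, ast.ImportFrom)):
            for alias in stmt.names:
                bindings[alias.asname or alias.name.split(".")[0]] = stmt
        elif isinstance(stmt, (ast.ClassDef, ast.FunctionDef)):
            bindings[stmt.name] = stmt
        elif isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    bindings[target.id] = stmt
    return bindings


def extract_block(source: str, target: str) -> str:
    """Rebuild `target` plus the imports and helpers it transitively needs."""
    bindings = top_level_bindings(ast.parse(source))
    if target not in bindings:
        raise KeyError(f"{target!r} is not defined at module top level")
    kept: dict[tuple[int, int], ast.stmt] = {}  # keyed by position to keep order
    pending = [target]
    while pending:
        stmt = bindings.get(pending.pop())
        if stmt is None:
            continue  # builtin, parameter, or local name
        key = (stmt.lineno, stmt.col_offset)
        if key in kept:
            continue  # already collected
        kept[key] = stmt
        pending.extend(names_read(stmt))
    return "\n\n".join(ast.unparse(kept[k]) for k in sorted(kept))


# Hypothetical toy module standing in for a repository file.
SOURCE = '''
import math
import os
from collections import OrderedDict

SCALE = 2.0

def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def unused_helper():
    return os.getcwd()

class Block:
    def forward(self, x):
        return SCALE * gelu(x)
'''

rebuilt = extract_block(SOURCE, "Block")
print(rebuilt)
```

The rebuilt text keeps `import math`, `SCALE`, `gelu`, and `Block`, and leaves out the unused imports and helper. A real extractor would also have to handle relative imports, star imports, and names bound inside nested scopes, which this sketch ignores.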

If this is right

Validated modules can be automatically regenerated with all dependencies intact when moved from one repository to another.
Multi-level de-duplication reveals that NN-RAG supplies far more structurally novel networks than other sources in the LEMUR dataset.
The neutral specifications of the extracted modules allow optional use with language models for synthesis or dataset registration.
The resulting library provides a provenance-tracked collection of neural architectures that supports systematic algorithmic discovery.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

A shared catalog built this way could let researchers combine modules from different projects to create hybrid architectures that no single repository contains.
If the extracted blocks remain stable across varied training conditions, the approach might eventually support standardized, reusable component libraries similar to those in conventional software engineering.
Scaling the same extraction process to additional frameworks beyond PyTorch would widen access to portable neural logic across the broader machine-learning community.

Load-bearing premise

The validator step accurately marks modules as scope-closed, compilable, and runnable without missing problems that only appear when the blocks are moved into new repositories or training regimes.

What would settle it

Taking a random sample of the 941 validated blocks, transplanting each into an unrelated fresh PyTorch repository, and checking whether they compile and run without any code changes would directly test whether the validation overestimates real-world usability.
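A minimal sketch of how such a transplant experiment could be summarized, assuming a simple random sample and a Wilson score interval on the observed success proportion. The function names, the seed, and the sample size of 100 are illustrative assumptions, not values from the paper. The last lines apply the same interval to the reported 941 of 1,289 validated candidates.

```python
import math
import random


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


def draw_sample(block_ids: list[int], k: int, seed: int = 0) -> list[int]:
    """Reproducible simple random sample of validated block ids."""
    return random.Random(seed).sample(block_ids, k)


# Hypothetical ids for the 941 validated blocks; 100 is an arbitrary sample size.
sample = draw_sample(list(range(941)), k=100)

# The same interval applied to the reported validation rate, 941 of 1,289.
low, high = wilson_interval(941, 1289)
print(f"validation rate {941 / 1289:.3f}, 95% Wilson interval [{low:.3f}, {high:.3f}]")
```

After transplanting the sampled blocks, the count that compile and run unchanged would go into `wilson_interval(successes, len(sample))`; a lower bound well below the validation rate would suggest the validator overstates portability.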

Figures

Figures reproduced from arXiv: 2512.04329 by Dmitry Ignatov, Radu Timofte, Waleed Khalid.

**Figure 2.** Distribution of extracted neural network blocks across …

**Figure 3.** Seven-phase extraction pipeline from automated block …

**Figure 4.** Extraction and validation statistics showing 100% ex…

**Figure 5.** Processing indicators and generated code volume.

**Figure 6.** Neural network code deduplication pipeline. The NN-DUP curation flow applies (1) exact deduplication with prefix-aware canonicalization; (2) lexical near-deduplication via MinHash+LSH with Jaccard verification; (3) structural deduplication using AST fingerprints; and (4) a diversity top-up that increases representation for underrepresented families without reintroducing near-duplicates. We use this to m…

**Figure 7.** Top 10 CIFAR-10 models in the LEMUR dataset ranked by accuracy. The best model, identified and assembled using the NN-RAG framework (rag6d58587b76d7e03be409f7e7289d4a58), attains 92.81% on the standard CIFAR-10 test split; numbers on the bars denote exact values.

Original abstract

Reusing existing neural-network components is central to research efficiency, yet discovering, extracting, and validating such modules across thousands of open-source repositories remains difficult. We introduce NN-RAG, a retrieval-augmented generation system that converts large, heterogeneous PyTorch codebases into a searchable and executable library of validated neural modules. Unlike conventional code search or clone-detection tools, NN-RAG performs scope-aware dependency resolution, import-preserving reconstruction, and validator-gated promotion -- ensuring that every retrieved block is scope-closed, compilable, and runnable. Applied to 19 major repositories, the pipeline extracted 1,289 candidate blocks, validated 941 (73.0%), and demonstrated that over 80% are structurally unique. Through multi-level de-duplication (exact, lexical, structural), we find that NN-RAG contributes the overwhelming majority of unique architectures to the LEMUR dataset, supplying approximately 72% of all novel network structures. Beyond quantity, NN-RAG uniquely enables cross-repository migration of architectural patterns, automatically identifying reusable modules in one project and regenerating them, dependency-complete, in another context. To our knowledge, no other open-source system provides this capability at scale. The framework's neutral specifications further allow optional integration with language models for synthesis or dataset registration without redistributing third-party code. Overall, NN-RAG transforms fragmented vision code into a reproducible, provenance-tracked substrate for algorithmic discovery, offering a first open-source solution that both quantifies and expands the diversity of executable neural architectures across repositories.
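The three de-duplication levels named in the abstract (exact, lexical, structural) can be illustrated with a standard-library sketch. This is an assumption-laden stand-in rather than the NN-DUP pipeline: the function names are hypothetical, exact matching uses a hash of the `ast.unparse` round trip instead of prefix-aware canonicalization, lexical similarity is an exact Jaccard score over token shingles instead of MinHash+LSH, and the structural key is a hash of AST node types only.

```python
import ast
import hashlib
import io
import tokenize

_KEPT_TOKENS = (tokenize.NAME, tokenize.OP, tokenize.NUMBER, tokenize.STRING)


def exact_key(source: str) -> str:
    """Hash of the source after an AST round trip (drops comments and layout)."""
    canonical = ast.unparse(ast.parse(source))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def shingles(source: str, k: int = 3) -> set[tuple[str, ...]]:
    """Set of k-token windows over names, operators, and literals."""
    reader = io.StringIO(source).readline
    toks = [t.string for t in tokenize.generate_tokens(reader)
            if t.type in _KEPT_TOKENS]
    return {tuple(toks[i:i + k]) for i in range(max(1, len(toks) - k + 1))}


def jaccard(a: str, b: str) -> float:
    """Exact Jaccard similarity of the two shingle sets."""
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def structural_key(source: str) -> str:
    """Hash of AST node types only, so renamed identifiers and changed
    constants map to the same key."""
    shape = " ".join(type(node).__name__ for node in ast.walk(ast.parse(source)))
    return hashlib.sha256(shape.encode("utf-8")).hexdigest()
```

Two blocks that differ only in comments share an `exact_key`; two that differ only in identifier names or constant values share a `structural_key` while scoring below 1.0 on `jaccard`. Where the thresholds sit, and how the levels are combined, is a design choice this sketch does not take from the paper.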

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

NN-RAG gives a working pipeline for pulling validated neural modules out of PyTorch repos at scale and shows it supplies most of the unique structures in LEMUR, but the validator's ability to guarantee clean migration is the part that needs more evidence.

NN-RAG turns fragmented PyTorch vision code into a library of reusable modules. The core idea is to use retrieval-augmented generation with scope-aware dependency resolution, import-preserving reconstruction, and a validator that only promotes blocks that are closed, compilable, and runnable. Applied to 19 repositories it extracted 1,289 candidates, kept 941 after validation, and found over 80 percent structurally unique. The multi-level de-duplication then shows NN-RAG accounting for roughly 72 percent of the novel architectures added to LEMUR. That concrete contribution and the cross-repository regeneration step are the parts that feel new relative to ordinary code search or clone detection work. The neutral specification that lets the system avoid redistributing third-party code is also a practical detail worth noting. The numbers give a clearer picture of architectural reuse than most prior efforts in this area.

The main soft spot is the validator. The pipeline claims the promoted blocks are ready for migration, yet the description does not detail what checks go beyond static scope and isolated execution. If the tests skip full training loops, gradient behavior, or hardware-specific cases, some fraction of the 941 blocks could fail once transplanted. That would directly affect the uniqueness counts and the 72 percent figure. The paper does not appear to report error bars or post-migration verification, so the migration claim rests on an assumption that needs stronger support.

This work is aimed at researchers who regularly reimplement common vision components or who want to study reuse patterns across projects. It also matters for anyone building or extending datasets like LEMUR. The engineering is grounded in real extractions and the claims are tied to measurable outputs rather than pure theory, so the paper deserves a serious referee. I would send it out with a request that reviewers examine the validator rules and any integration tests in detail.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces NN-RAG, a retrieval-augmented generation pipeline that extracts neural-network modules from heterogeneous PyTorch codebases via scope-aware dependency resolution, import-preserving reconstruction, and validator-gated promotion. Applied to 19 repositories, it reports 1,289 candidate blocks extracted, 941 validated (73.0%), over 80% structural uniqueness after multi-level de-duplication, and NN-RAG supplying approximately 72% of novel structures to the LEMUR dataset while enabling cross-repository migration of architectural patterns.

Significance. If the validator reliably certifies modules that remain functional and dependency-complete after transplantation into new repositories and training regimes, the work would provide a valuable open-source substrate for quantifying and expanding architectural diversity in computer vision, supporting reproducible reuse of algorithmic components across fragmented codebases.

major comments (2)

[Abstract] Abstract: the headline claims (1,289 candidates, 941 validated at 73%, 72% of novel LEMUR structures) rest on the validator correctly identifying scope-closed, compilable, and runnable modules; yet the abstract supplies no description of the validator rules, whether isolated forward-pass tests suffice or full training loops/gradient flow/hardware behaviors are exercised, or any error-bar analysis for false positives.
[Pipeline Description] Pipeline / validator-gated promotion: the description does not report integration or regime-shift tests on promoted blocks, leaving open the possibility that blocks compile in isolation but fail when transplanted due to untested cross-repository dependencies; this directly undermines the central claim of cross-repository migration and the 72% novelty contribution.

minor comments (1)

[Abstract] Abstract: the phrase 'over 80% are structurally unique' would benefit from an explicit definition of the structural similarity metric used in the multi-level de-duplication step.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below with clarifications drawn directly from the manuscript and indicate revisions where they strengthen the presentation without altering the core claims.

Referee: [Abstract] Abstract: the headline claims (1,289 candidates, 941 validated at 73%, 72% of novel LEMUR structures) rest on the validator correctly identifying scope-closed, compilable, and runnable modules; yet the abstract supplies no description of the validator rules, whether isolated forward-pass tests suffice or full training loops/gradient flow/hardware behaviors are exercised, or any error-bar analysis for false positives.

Authors: We agree that the abstract would benefit from a concise summary of the validator. Section 3 of the manuscript specifies that the validator enforces scope closure via dependency resolution, confirms compilability through import-preserving reconstruction, and executes isolated forward-pass tests to verify basic runnability. Full training loops, gradient flow, or hardware-specific behaviors are outside the validation scope because the objective is extraction of self-contained, executable modules rather than end-to-end training. No error-bar analysis for false positives is currently reported. We will revise the abstract to include a brief description of these validator rules. revision: yes
Referee: [Pipeline Description] Pipeline / validator-gated promotion: the description does not report integration or regime-shift tests on promoted blocks, leaving open the possibility that blocks compile in isolation but fail when transplanted due to untested cross-repository dependencies; this directly undermines the central claim of cross-repository migration and the 72% novelty contribution.

Authors: The manuscript's central mechanism—scope-aware dependency resolution combined with import-preserving reconstruction—explicitly produces dependency-complete modules, which directly supports the cross-repository migration claim and the 72% novelty contribution to LEMUR after multi-level de-duplication. The validator confirms that each promoted block is scope-closed and runnable in its extracted form. While explicit regime-shift or integration tests (e.g., retraining transplanted modules under new data regimes) are not reported in the current version, the reconstruction process is designed to enable such transplantation. We will add a short discussion with migration examples in the revised manuscript to make this support more explicit. revision: partial
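The three checks named in the first response (scope closure, compilability, an isolated forward pass) can be sketched as a gate that reports each result separately. This is an illustration under stated assumptions, not the paper's validator: `unresolved_names` and `validate` are hypothetical names, the scope check is flow-insensitive and ignores nesting, and a plain Python class with a `forward` method stands in for a PyTorch module so the example needs no third-party packages.

```python
import ast
import builtins


def unresolved_names(source: str) -> set[str]:
    """Names that are read but bound nowhere in `source` (coarse check)."""
    bound = set(dir(builtins))
    read: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            bound.update(a.asname or a.name.split(".")[0] for a in node.names)
        elif isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name):
            (read if isinstance(node.ctx, ast.Load) else bound).add(node.id)
    return read - bound


def validate(source: str, entry: str, sample_input) -> dict[str, bool]:
    """Run the three checks in order and stop at the first failure."""
    report = {"scope_closed": False, "compilable": False, "runnable": False}
    try:
        report["scope_closed"] = not unresolved_names(source)
        code = compile(source, "<block>", "exec")
        report["compilable"] = True
    except SyntaxError:
        return report
    if not report["scope_closed"]:
        return report
    namespace: dict = {}
    try:
        # Executes extracted code: only do this inside a sandbox.
        exec(code, namespace)
        namespace[entry]().forward(sample_input)  # isolated smoke call
        report["runnable"] = True
    except Exception:
        pass
    return report


good = "import math\nclass Block:\n    def forward(self, x):\n        return math.tanh(x)\n"
leaky = "class Block:\n    def forward(self, x):\n        return helper(x)\n"
print(validate(good, "Block", 0.5))
print(validate(leaky, "Block", 0.5))
```

A gate like this passes blocks that run once in a fresh namespace, which is exactly the property the referee questions: it says nothing about gradient flow, training behavior, or conflicts with the code already present in a destination repository.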

Circularity Check

0 steps flagged

No significant circularity; engineering pipeline reports direct empirical outputs

full rationale

The paper presents a retrieval-augmented generation pipeline for extracting and validating neural modules from PyTorch repositories. Reported figures (1289 candidates, 941 validated at 73%, 72% novel structures in LEMUR) are direct counts obtained by applying the described extraction, scope-resolution, and validator steps to 19 external open-source repositories. No equations, fitted parameters, or derivations exist that reduce these counts to quantities defined inside the same work. Multi-level de-duplication and uniqueness claims follow from post-extraction analysis of the collected set rather than any self-referential loop. The validator is described as performing static and isolated runtime checks; its correctness is an external assumption, not a definitional identity. Self-citations, if present, do not carry the load of the central claims. The contribution is therefore self-contained as a systems description whose results are reproducible from the stated inputs and open repositories.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that the chosen 19 repositories are representative and that the validator rules are sufficient to guarantee executability after transplantation. No free parameters or invented physical entities are introduced.

axioms (2)

domain assumption: PyTorch codebases can be parsed into scope-closed blocks whose dependencies are fully recoverable from import statements and call graphs.
Invoked when the pipeline performs scope-aware dependency resolution and import-preserving reconstruction.
domain assumption: A static validator can reliably detect whether an extracted block will execute without runtime errors in a new context.
Central to the validator-gated promotion step that produces the 73% validation rate.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: NN-RAG performs scope-aware dependency resolution, import-preserving reconstruction, and validator-gated promotion -- ensuring that every retrieved block is scope-closed, compilable, and runnable.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_injective · unclear
Paper passage: multi-level de-duplication (exact, lexical, structural) ... 72.46% of the unique set are NN-RAG extractions

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs
cs.LG · 2026-05 · unverdicted · novelty 7.0

Fine-tuned 7B LLMs generating unified diffs for neural architecture refinement achieve 66-75% valid rates and 64-66% mean first-epoch accuracy, outperforming full-generation baselines by large margins while cutting ou...

Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models
cs.CV · 2026-01 · unverdicted · novelty 6.0

Closed-loop LLM search with AST-generated examples discovers non-standard channel widths that improve vision model performance over initial architectures on CIFAR-100.

Enhancing LLM-Based Neural Network Generation: Few-Shot Prompting and Efficient Validation for Automated Architecture Design
cs.CV · 2025-12 · conditional · novelty 6.0

Three-example few-shot prompting optimizes LLM-generated vision architectures while a whitespace-normalized hash provides 100x faster duplicate detection than AST parsing across seven benchmarks.

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
cs.LG · 2025-11 · unverdicted · novelty 4.0

FractalNet automatically generates and tests over 1,200 CNN architectures based on recursive fractal templates, achieving up to 80.18% accuracy on CIFAR-10 after five training epochs.

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis
cs.LG · 2025-11 · unverdicted · novelty 3.0

Fractal templates enable systematic creation of more than 1,200 neural network variants that show strong performance and computational efficiency when trained on CIFAR-10 for five epochs.

[1] [1]

com / facebookresearch/detectron2, 2019

Detectron2: Fair’s next-generation detection and segmentation library.https : / / github . com / facebookresearch/detectron2, 2019

work page 2019

[2] [2]

com / open - mmlab / mmdetection, 2019

Mmdetection: Openmmlab detection toolbox and bench- mark.https : / / github . com / open - mmlab / mmdetection, 2019

work page 2019

[3] Software Heritage persistent identifiers (SWHIDs). https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html, 2021.

[4] RepoCoder (PDF). https://arxiv.org/pdf/2303.12570, 2023.

[5] SWE-agent (GitHub). https://github.com/SWE-agent/SWE-agent, 2024.

[6] SWE-bench Lite. https://www.swebench.com/lite.html, 2024.

[7] concurrent.futures — launching parallel tasks (Python docs). https://docs.python.org/3/library/concurrent.futures.html, 2025.

[8] Hugging Face Hub — documentation overview. https://huggingface.co/docs/hub/en/index, 2025.

[9] NN-dup: Neural network deduplication pipeline. https://github.com/ABrain-One/nn-dup, 2025. Prefix-aware exact/near/AST dedup + diversity top-up.

[10] NN-RAG: Retrieval-augmented generation for neural network code. https://github.com/ABrain-One/nn-rag, 2025. GitHub repository.

[11] The Python import system (language reference, section 5). https://docs.python.org/3/reference/import.html, 2025.

[12] SQLite write-ahead logging. https://sqlite.org/wal.html, 2025. [13] SQLite Documentation, 2025. Accessed 2025-11-01.

[13] PyTorch Image Models (timm) — official docs. https://timm.fast.ai/, 2025.

[14] torch.hub — PyTorch documentation. https://docs.pytorch.org/docs/stable/hub.html, 2025.

[15] Torchvision models and pre-trained weights. https://docs.pytorch.org/vision/main/models.html, 2025.

[16] Binomial proportion confidence interval — Wilson score interval. https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval, 2025.

[17] Rehab Aleithan et al. SWE-bench+: Enhanced coding benchmark for LLMs. arXiv:2410.06992, 2024.

[18] Anonymous. LEMUR neural network dataset: Towards seamless AutoML. arXiv preprint arXiv:2504.10552, 2025. Authors anonymized for review.

[19] BigCode. Large-scale near-deduplication behind BigCode. https://huggingface.co/blog/dedup, 2023.

[20] Dong Chen et al. CodeR: Issue resolving with multi-agent and task graphs. arXiv:2406.01304, 2024.

[21] Kai Chen, Jiayue Huang, et al. MMDetection: Open MMLab detection toolbox and benchmark. arXiv:1906.07155, 2019.

[22] Ekin D. Cubuk, Barret Zoph, Jonathon Shlens, and Quoc V. Le. RandAugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 702–703, 2020. [24] Git: git-clone Manual. Git Project, 2025. Accessed 2025-11-01.

[23] Xiaodong Gu, Hongyu Zhang, and Sunghun Kim. Deep code search. In ICSE, 2018.

[24] Xiaodong Gu, Hongyu Zhang, and Sunghun Kim. Deep code search. DL ACM, 2018.

[25] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 630–645. Springer, 2016.

[26] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7132–7141, 2018.

[27] Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q. Weinberger. Deep networks with stochastic depth. In Proceedings of the European Conference on Computer Vision (ECCV), pages 646–661. Springer, 2016.

[28] Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, and Ting Liu. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. https://arxiv.org/abs/2311.05232, 2023.

[29] Lingxiao Jiang, Ghassan Misherghi, Zhendong Su, and Stephane Glondu. Deckard: Scalable and accurate tree-based detection of code clones. In ICSE, 2007.

[30] Carlos E. Jimenez et al. SWE-bench: Can language models resolve real-world GitHub issues? arXiv:2310.06770, 2023.

[31] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems, pages 9459–9474, 2020.

[32] Meta Open Source. LibCST: Concrete syntax tree parser and transformer for Python. https://libcst.readthedocs.io/, 2024. Accessed 2025-11-03.

[33] N. Perry et al. Do users write more insecure code with AI assistants? In ACM CCS, 2023.

[34] Joelle Pineau et al. Improving reproducibility in machine learning research. Journal of Machine Learning Research, 2021.

[35] Python Software Foundation. importlib — the implementation of import. https://docs.python.org/3/library/importlib.html, 2024.

[36] Python Software Foundation. importlib.metadata — access package metadata. https://docs.python.org/3/library/importlib.metadata.html, 2024.

[37] Python Software Foundation. ast — abstract syntax trees. https://docs.python.org/3/library/ast.html, 2024.

[38] Python Software Foundation. compile() — built-in functions. https://docs.python.org/3/library/functions.html#compile, 2024.

[39] Python Software Foundation. Execution model — naming and binding. https://docs.python.org/3/reference/executionmodel.html, 2024.

[40] Python Software Foundation. The import system. https://docs.python.org/3/reference/import.html, 2024. [43] PEP 508 — Dependency specification for Python Software Packaging. Python Software Foundation, 2025. Accessed 2025-11-01.

[41] Python Software Foundation. The Python import system. https://docs.python.org/3/reference/import.html, 2025. [45] concurrent.futures — Launching parallel tasks. Python Software Foundation, 2025. Accessed 2025-11-01. [46] graphlib — Functionality to operate with graph-like structures. Python Software Foundation, 2025. Accessed 2025-11-01. [47] hashlib ...

[42] PyTorch Contributors. Extending PyTorch. https://pytorch.org/tutorials/advanced/cpp_extension.html, 2024. C++/CUDA extensions and operator registration.

[43] Chanchal K. Roy and James R. Cordy. NiCad: A next generation clone detection tool. In CSER, 2009.

[44] Hitesh Sajnani, Vaibhav Saini, Jeffrey Svajlenko, Chanchal K. Roy, and Cristina V. Lopes. SourcererCC: Scaling code clone detection to big-code. In Proceedings of the 38th International Conference on Software Engineering (ICSE), pages 1157–1168, 2016.

[45] Guido van Rossum, Barry Warsaw, and Nick Coghlan. PEP 8 — style guide for Python code. https://peps.python.org/pep-0008/, 2025. Accessed 2025-11-01.

[46] Jiawei Yang et al. SWE-agent: Agent-computer interfaces enable automated software engineering. In NeurIPS, 2024.

[47] Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 6023–6032, 2019.

[48] Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, and Weizhu Chen. RepoCoder: Repository-level code completion through iterative retrieval and generation. arXiv:2303.12570, 2023.

[49] Hongyi Zhang, Moustapha Cissé, Yann N. Dauphin, and David Lopez-Paz. mixup: Beyond empirical risk minimization. In International Conference on Learning Representations (ICLR), 2018. arXiv:1710.09412.

[50] Richard Zhang. Making convolutional networks shift-invariant again. In Proceedings of the 36th International Conference on Machine Learning (ICML), pages 7324–