Beyond Takedown: Measuring Malicious Go Module Persistence in the Wild
Pith reviewed 2026-06-26 01:25 UTC · model grok-4.3
The pith
Malicious Go modules remain retrievable via the Go proxy in at least 99.4 percent of cases after they become unobservable on GitHub.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We identified 2,289 malicious versions of legitimate Go modules. Purely GitHub-centric searches fail to identify the full extent of the compromise and are only effective for as long as the affected code is present on the platform. Among artifacts later found to be GitHub-unobservable, at least 99.4 percent remained retrievable via Go proxy. Following disclosure, GitHub removed 684 malicious repositories and the Google Go team remediated 1,377 module versions.
What carries the argument
The GOAST deobfuscating AST scanner that locates obfuscated import-triggered downloader code across large module indexes.
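To make "import-triggered" concrete: in Go, `init` functions run as soon as a package is imported, so a downloader placed there executes without any explicit call. The sketch below is not GOAST and performs no deobfuscation. It only illustrates the general idea of walking an AST to find calls into `os/exec` or `net/http` inside `init`. The list of watched packages and the sample source are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// watched lists import paths whose use inside init is treated as a signal.
// This list is an illustrative assumption, not the paper's rule set.
var watched = map[string]bool{"os/exec": true, "net/http": true}

// sample is a hypothetical source file with a command run at import time.
const sample = `package demo

import "os/exec"

func init() {
	exec.Command("sh", "-c", "echo hi").Run()
}
`

// findInitCalls parses one Go source file and returns the watched
// package-level calls (for example "os/exec.Command") found in init functions.
func findInitCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}

	// Map the local name of each watched import to its import path.
	local := map[string]string{}
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil || !watched[path] {
			continue
		}
		name := path[strings.LastIndex(path, "/")+1:]
		if imp.Name != nil {
			name = imp.Name.Name // explicit alias, e.g. import x "os/exec"
		}
		local[name] = path
	}

	var hits []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil || fn.Name.Name != "init" || fn.Body == nil {
			continue
		}
		// Walk the init body and record selector calls on watched imports.
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			sel, ok := call.Fun.(*ast.SelectorExpr)
			if !ok {
				return true
			}
			if id, ok := sel.X.(*ast.Ident); ok {
				if path, found := local[id.Name]; found {
					hits = append(hits, path+"."+sel.Sel.Name)
				}
			}
			return true
		})
	}
	return hits, nil
}

func main() {
	hits, err := findInitCalls(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(hits)
}
```

A purely syntactic check like this is easy to evade (package-level variable initializers, string-built identifiers, shadowed names), which is presumably why the paper's scanner includes a deobfuscation step.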
If this is right
- GitHub takedowns alone leave most malicious modules reachable through the proxy.
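- The full extent of such supply-chain campaigns can be measured only when both the hosting platform (GitHub) and the Go proxy are examined.
- Disclosure to GitHub and the Go team produced 2,061 concrete remediations: 684 repositories removed by GitHub and 1,377 module versions remediated by the Google Go team.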
- Repackaged modules with hidden downloaders represent a repeatable attack pattern in the Go ecosystem.
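The repackaging pattern in the last point suggests one simple heuristic: flag a dependency whose repository name matches a trusted module but whose GitHub owner differs. The sketch below assumes a caller-supplied allowlist. The function names and all module paths are hypothetical, and the paper does not describe this check.

```go
package main

import (
	"fmt"
	"strings"
)

// splitGitHubPath returns the owner and repository of a github.com module
// path, ignoring any subdirectory or major-version suffix after the repo.
func splitGitHubPath(mod string) (owner, repo string, ok bool) {
	parts := strings.Split(mod, "/")
	if len(parts) < 3 || parts[0] != "github.com" {
		return "", "", false
	}
	return parts[1], parts[2], true
}

// lookalikes reports dependencies whose repository name matches a trusted
// module while the owner differs. Matching is case-insensitive.
func lookalikes(deps, trusted []string) []string {
	owners := map[string]string{} // repository name -> trusted owner
	for _, t := range trusted {
		if o, r, ok := splitGitHubPath(t); ok {
			owners[strings.ToLower(r)] = strings.ToLower(o)
		}
	}
	var flagged []string
	for _, d := range deps {
		o, r, ok := splitGitHubPath(d)
		if !ok {
			continue
		}
		if want, known := owners[strings.ToLower(r)]; known && want != strings.ToLower(o) {
			flagged = append(flagged, d)
		}
	}
	return flagged
}

func main() {
	// Hypothetical module paths, for illustration only.
	trusted := []string{"github.com/example-org/widget"}
	deps := []string{
		"github.com/example-org/widget",
		"github.com/other-user/widget/v2",
		"github.com/example-org/gadget",
	}
	fmt.Println(lookalikes(deps, trusted))
}
```

Legitimate forks share a repository name with their upstream, so a hit from this check is a prompt for review rather than evidence of compromise.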
Where Pith is reading between the lines
- Package proxies in other languages may exhibit similar post-takedown persistence.
- Routine scanning of proxy mirrors could close the remediation gap earlier than GitHub monitoring alone.
- Module consumers could add proxy-origin checks to their dependency tools.
Load-bearing premise
The GOAST scanner detects the obfuscated malicious code with low false-positive and false-negative rates over the full 12.3 million index entries.
What would settle it
A random sample audit of the flagged versions that finds a substantial fraction are not malicious, or a direct check showing many GitHub-removed modules are unavailable through the Go proxy.
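The second check can be approximated with the public GOPROXY protocol, which serves version metadata at `$base/$module/@v/$version.info` and case-encodes uppercase letters as `!` plus the lowercase letter. The sketch below builds that URL and tests for an HTTP 200. The helper names and the module path are hypothetical, and the paper's own measurement procedure may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// escapePath applies the GOPROXY case encoding: each ASCII uppercase letter
// becomes '!' followed by its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// infoURL builds the .info endpoint for one module version on a proxy.
func infoURL(base, module, version string) string {
	return strings.TrimRight(base, "/") + "/" + escapePath(module) +
		"/@v/" + escapePath(version) + ".info"
}

// retrievable reports whether the proxy answers the .info request with 200.
// Other statuses (typically 404 or 410) are treated as not retrievable.
func retrievable(url string) (bool, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

func main() {
	// Hypothetical module path and version, for illustration only.
	u := infoURL("https://proxy.golang.org", "github.com/Example-Org/widget", "v1.2.3")
	fmt.Println(u)
	// Call retrievable(u) to perform the live check; it is skipped here so
	// the example runs without network access.
}
```

Running such a check over the set of GitHub-removed modules and counting failures would directly test the 99.4 percent lower bound.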
Original abstract
We measure an automation-based supply chain campaign in the Go ecosystem. The attackers repackage legitimate Go modules under attacker-controlled owners, and embed them with obfuscated code for an import-triggered downloader. Our results come from two complementary analyses: a) a manual search on GitHub across 2,113 repositories and b) a large-scale scan of 12.3M index entries using a deobfuscating AST scanner (GOAST) that we implemented. As a result, we identified 2,289 malicious versions of legitimate Go modules. We demonstrate that purely GitHub-centric searches fail to identify the full extent of the compromise and are only effective for as long as the affected code is present on the platform. Moreover, our proxy-based measurements of the takedown-remediation gap reveal that among artifacts later found to be GitHub-unobservable (i.e., removed or suspended), at least 99.4% remained retrievable via Go proxy. Following our disclosure, GitHub has removed 684 malicious repositories and the Google Go team has remediated 1,377 module versions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper measures an automation-based supply chain attack in the Go ecosystem in which attackers repackage legitimate modules under new owners and embed obfuscated import-triggered downloaders. It reports results from two complementary methods, a manual GitHub search across 2,113 repositories and a large-scale scan of 12.3M index entries with a custom deobfuscating AST scanner (GOAST), which together yield 2,289 malicious versions. The work further shows that among artifacts later found to be GitHub-unobservable, at least 99.4% remained retrievable via the Go proxy, and notes post-disclosure remediation actions by GitHub and the Go team.
Significance. If the quantitative claims hold, the study supplies the first large-scale empirical evidence of malicious-module persistence across the Go proxy after GitHub takedowns, demonstrating the practical limits of platform-centric remediation. The dual-method design and concrete counts (2,289 versions, 99.4% proxy persistence) are directly usable by the security community for threat modeling and proxy policy.
major comments (1)
- [Large-scale scan / GOAST methodology] The section describing the large-scale scan (GOAST) states that the scanner performs deobfuscating AST analysis to detect import-triggered downloader code across 12.3M entries and produces the headline count of 2,289 malicious versions, yet supplies no precision/recall figures, no ground-truth labeled corpus size, no false-positive audit on a random sample, and no comparison against an independent detector. Because the 2,289 count and the conditioned 99.4% persistence statistic rest directly on this unvalidated detector, the central quantitative claims cannot be assessed without these metrics.
minor comments (2)
- [Abstract and Results] The abstract and results paragraphs report exact counts (2,289, 684, 1,377) without accompanying confidence intervals or sensitivity analysis; adding these would strengthen the presentation even if the underlying detector validation is supplied.
- [Manual GitHub search] The manual-search methodology (2,113 repositories) is described at a high level; a brief enumeration of the search keywords or repository-selection criteria would improve reproducibility.
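One way to attach an interval to a count-based proportion, as the first minor comment asks, is the Wilson score interval. The sketch below is a generic implementation. The sample size and outcome in `main` are hypothetical numbers, not figures from the paper.

```go
package main

import (
	"fmt"
	"math"
)

// wilson returns the Wilson score interval for k successes in n trials.
// z is the normal quantile (1.96 gives roughly 95% confidence).
func wilson(k, n int, z float64) (lo, hi float64) {
	if n == 0 {
		return 0, 1 // no data: the proportion is unconstrained
	}
	nn := float64(n)
	p := float64(k) / nn
	denom := 1 + z*z/nn
	center := (p + z*z/(2*nn)) / denom
	half := z * math.Sqrt(p*(1-p)/nn+z*z/(4*nn*nn)) / denom
	return math.Max(0, center-half), math.Min(1, center+half)
}

func main() {
	// Hypothetical audit: 196 of 200 sampled flagged versions confirmed malicious.
	lo, hi := wilson(196, 200, 1.96)
	fmt.Printf("precision estimate 0.980, 95%% interval [%.3f, %.3f]\n", lo, hi)
}
```

The Wilson interval is chosen here because it behaves sensibly for proportions near 0 or 1, which is the regime a 99.4% figure sits in.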
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the GOAST methodology. We agree that additional validation details are needed to support the quantitative claims and will revise the manuscript accordingly.
Point-by-point responses
- Referee: The section describing the large-scale scan (GOAST) states that the scanner performs deobfuscating AST analysis to detect import-triggered downloader code across 12.3M entries and produces the headline count of 2,289 malicious versions, yet supplies no precision/recall figures, no ground-truth labeled corpus size, no false-positive audit on a random sample, and no comparison against an independent detector. Because the 2,289 count and the conditioned 99.4% persistence statistic rest directly on this unvalidated detector, the central quantitative claims cannot be assessed without these metrics.
Authors: We acknowledge that the submitted manuscript does not report precision/recall, ground-truth corpus size, or a false-positive audit for GOAST. The manual search over 2,113 repositories was used to develop and iteratively refine the deobfuscation rules in GOAST, but these steps were not quantified in the text. In the revision we will insert a dedicated validation subsection that (1) states the size of the manually labeled corpus, (2) reports precision and recall on a held-out portion of that corpus, (3) presents the outcome of a random-sample false-positive audit, and (4) explains why an independent public detector for this exact obfuscated pattern was unavailable for comparison. The 99.4% persistence figure is measured directly from proxy retrieval attempts on the subset of modules already confirmed malicious by either method; it does not depend on GOAST’s detection rate.
Revision: yes
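The precision and recall promised in item (2) reduce to simple ratios over a labeled held-out set. A minimal sketch follows. The type name and the counts are hypothetical, since the manuscript reports no such figures.

```go
package main

import "fmt"

// confusion holds detector outcomes on a labeled held-out corpus.
type confusion struct {
	TP int // malicious versions correctly flagged
	FP int // benign versions wrongly flagged
	FN int // malicious versions missed
}

// precision is the share of flagged versions that are truly malicious.
func (c confusion) precision() float64 {
	if c.TP+c.FP == 0 {
		return 0
	}
	return float64(c.TP) / float64(c.TP+c.FP)
}

// recall is the share of truly malicious versions that were flagged.
func (c confusion) recall() float64 {
	if c.TP+c.FN == 0 {
		return 0
	}
	return float64(c.TP) / float64(c.TP+c.FN)
}

func main() {
	// Hypothetical counts, for illustration only.
	c := confusion{TP: 90, FP: 10, FN: 30}
	fmt.Printf("precision=%.2f recall=%.2f\n", c.precision(), c.recall())
}
```

Precision bounds how much of the 2,289 count could be false alarms, while recall bounds how many malicious versions the scan may have missed, so both are needed to assess the headline number.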
Circularity Check
No circularity: pure empirical measurement study
The paper reports direct counts from manual GitHub repository searches (2,113 repos) and a large-scale scan of 12.3M index entries via the GOAST deobfuscating AST scanner, yielding 2,289 malicious versions and a 99.4% proxy persistence statistic. No equations, fitted parameters, predictions, ansatzes, uniqueness theorems, or self-citation chains appear in the methodology or results. The 2,289 count and downstream statistics are raw empirical outputs conditioned on the scan, not reductions by construction to prior inputs or self-referential definitions. Scanner accuracy concerns are validity issues, not circularity.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: The Go module index contains 12.3M entries that accurately reflect published modules, and the proxy serves the same artifacts as the index.
- Domain assumption: Manual review of 2,113 repositories provides a representative sample for identifying the campaign pattern.