MOD-Finder: Identify multi-omics data sets related to defined chemical exposure
Pith reviewed 2026-05-24 21:24 UTC · model grok-4.3
The pith
MOD-Finder automates searches across public databases for multi-omics data sets related to a given chemical exposure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present MOD-Finder, a web application that searches multiple public repositories for omics data sets associated with exposure to a user-specified chemical. It returns the data sets in an accessible format, together with information on effects the chemical is assumed to trigger.
What carries the argument
MOD-Finder, an R Shiny web service that maps chemical identifiers and queries omics databases automatically.
If this is right
- Researchers can locate related multi-omics data sets without performing repeated manual cross-database searches.
- Integration of transcriptome, proteome, and other layers for a given chemical exposure becomes more feasible.
- Users receive effect information alongside the data-set listings to guide further analysis.
- Both web access and local reuse are supported through the open-source release.
Where Pith is reading between the lines
- The mapping step may reduce naming inconsistencies that currently hinder large-scale comparisons in toxicology.
- The service could be extended to additional databases or omics types without changing its core workflow.
- Automated retrieval might enable systematic meta-analyses of chemical-response patterns across studies.
Load-bearing premise
Chemical names and identifiers can be mapped reliably enough across databases to retrieve the relevant data sets without missing important matches or including many false ones.
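To make the premise concrete, here is a minimal sketch of what a synonym-resolution step could look like. It is not MOD-Finder's actual procedure, which the abstract does not describe. The synonym table, the `resolve` helper, and the second "BPA" target are all hypothetical and exist only to show how an ambiguous name might be flagged rather than silently forwarded.

```python
from collections import defaultdict

# Hypothetical synonym table of (name, identifier) pairs, standing in for
# what a chemical database lookup might return. Illustrative only.
SYNONYM_PAIRS = [
    ("bisphenol A", "CAS:80-05-7"),
    ("BPA", "CAS:80-05-7"),
    ("4,4'-isopropylidenediphenol", "CAS:80-05-7"),
    ("BPA", "HYPOTHETICAL:other-compound"),  # made-up entry to show ambiguity
]


def normalize(name):
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def build_index(pairs):
    """Map each normalized name to the set of identifiers it may refer to."""
    index = defaultdict(set)
    for name, identifier in pairs:
        index[normalize(name)].add(identifier)
    return index


def resolve(name, index):
    """Return (status, identifiers); status is 'unique', 'ambiguous', or 'unknown'."""
    identifiers = sorted(index.get(normalize(name), ()))
    if not identifiers:
        return "unknown", []
    status = "unique" if len(identifiers) == 1 else "ambiguous"
    return status, identifiers


INDEX = build_index(SYNONYM_PAIRS)

for query in ["Bisphenol  A", "bpa", "caffeine"]:
    print(query, resolve(query, INDEX))
```

Returning a status instead of picking one identifier keeps the ambiguity visible to the user, which is the kind of conflict flagging the referee report asks the authors to document.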
What would settle it
Running the tool on a well-studied chemical such as bisphenol A and checking whether all known published omics data sets appear in the results or whether many irrelevant ones are returned due to identifier mismatches.
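Such a check could be scored roughly as follows. This is a sketch under stated assumptions: the function name `retrieval_metrics` and the accession strings (`DS-001` and so on) are placeholders, not real repository records, and a manually curated list of known data sets is assumed to exist.

```python
def retrieval_metrics(returned, known):
    """Compare returned data-set accessions with a curated reference list."""
    returned, known = set(returned), set(known)
    hits = returned & known
    return {
        # Share of returned data sets that are actually relevant.
        "precision": len(hits) / len(returned) if returned else 0.0,
        # Share of known relevant data sets that were found.
        "recall": len(hits) / len(known) if known else 0.0,
        "missed": sorted(known - returned),
        "spurious": sorted(returned - known),
    }


# Placeholder accessions for illustration only.
returned_by_tool = ["DS-001", "DS-002", "DS-003", "DS-999"]
curated_reference = ["DS-001", "DS-002", "DS-003", "DS-004", "DS-005"]

metrics = retrieval_metrics(returned_by_tool, curated_reference)
print(metrics)
```

The `missed` list points at identifier mismatches that hide relevant data sets, while the `spurious` list points at false matches, the two failure modes named in the premise.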
Original abstract
Summary: Integration of multi-omics data on chemical exposure of cells or organisms promises a more complete representation of the responding pathways than single-omics data. Data of different omics layers, such as transcriptome or proteome, are deposited in different repositories. Additionally, precisely specifying a chemical of interest that was used in the exposure experiments suffers from different nomenclatures and non-unique mapping of chemical identifiers. The manual search for corresponding omics data sets of different layers for exposure to a chemical of interest is thus a tedious task. We have developed MOD-Finder (Multi-Omics Data set Finder) to efficiently search for chemical-related omics data sets in several publicly available databases in an automated manner. A plain and simple presentation of the returned omics data sets is augmented with information on effects that are assumed to be triggered by the chemical of interest.
Availability and Implementation: MOD-Finder is implemented in R using the Shiny package. The web service is available at https://webapp.ufz.de/mod_finder and the source code under the GNU GPL v3 license at https://github.com/yigbt/MOD-Finder.
Supplementary information: Supplementary data are available at https://www.ufz.de/index.php?en=44919
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes MOD-Finder, an R/Shiny web application that automates searches across public omics repositories (e.g., GEO, ArrayExpress, PRIDE) for datasets linked to a user-specified chemical exposure. It returns matching multi-omics records augmented with effect annotations assumed to be triggered by the chemical, addressing the tedium of manual searches caused by inconsistent chemical nomenclatures and non-unique identifier mappings. The tool and source code are made publicly available.
Significance. If the mapping and search logic function as intended, the tool could reduce the manual effort required to locate multi-omics data for chemical-exposure studies and thereby support pathway-level integration across omics layers. The public release of both the web service and GPL-licensed source code is a clear strength that enables reproducibility and community extension.
major comments (2)
- [Abstract] The manuscript explicitly identifies non-unique chemical-identifier mappings as the central obstacle to manual search, yet supplies no description of the synonym-resolution procedure (which synonym databases are queried, how conflicts are ranked or flagged to the user, or whether the tool simply forwards the raw input string). Without this, the automation claim reduces to parallel query submission rather than resolution of the stated problem.
- No section provides quantitative validation (precision, recall, or coverage) of search results on a benchmark set of chemicals with known ambiguous names; soundness therefore rests solely on the code release rather than demonstrated performance on the core use case.
minor comments (1)
- The supplementary information link is given only as a UFZ institutional page; a direct DOI or stable archive of the example queries and output would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment below and note planned revisions.
Point-by-point responses
- Referee: [Abstract] The manuscript explicitly identifies non-unique chemical-identifier mappings as the central obstacle to manual search, yet supplies no description of the synonym-resolution procedure (which synonym databases are queried, how conflicts are ranked or flagged to the user, or whether the tool simply forwards the raw input string). Without this, the automation claim reduces to parallel query submission rather than resolution of the stated problem.
  Authors: We agree the abstract omits details on synonym resolution. The manuscript body references public chemical databases to handle nomenclature issues, but we will revise the abstract to briefly describe the mapping approach and expand the methods to specify queried resources (e.g., PubChem, ChEBI) and conflict-handling logic. Revision: yes.
- Referee: No section provides quantitative validation (precision, recall, or coverage) of search results on a benchmark set of chemicals with known ambiguous names; soundness therefore rests solely on the code release rather than demonstrated performance on the core use case.
  Authors: We acknowledge that a benchmark evaluation would strengthen claims. However, constructing a gold-standard set of chemical-omics associations requires substantial manual curation not available in existing resources and lies outside the scope of this tool-description paper. The GPL-licensed code enables independent assessment. We will add a discussion paragraph on this limitation and avenues for future validation. Revision: partial.
Circularity Check
No circularity; software tool description without derivations or self-referential claims
full rationale
The paper describes the implementation of MOD-Finder, an R/Shiny web tool for automated querying of public omics repositories using chemical identifiers. No equations, fitted parameters, predictions, uniqueness theorems, or ansatzes appear. The abstract and implementation section state the problem of non-unique chemical mappings but present the tool as a convenience layer for query submission and result display; no claim is made that the tool resolves ambiguities via any derived procedure. No self-citations are load-bearing. The contribution is self-contained as a software artifact and does not reduce any result to its inputs by construction.