WiseOWL: A Methodology for Evaluating Ontological Descriptiveness and Semantic Correctness for Ontology Reuse and Ontology Recommendations
Pith reviewed 2026-05-10 15:48 UTC · model grok-4.3
The pith
WiseOWL scores ontologies on documentation coverage, label alignment via embeddings, structural connections, and hierarchical balance to support reuse decisions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
WiseOWL is a methodology for evaluating ontological descriptiveness and semantic correctness that computes four metrics: Well-Described for documentation coverage, Well-Defined using state-of-the-art embeddings to measure label-definition alignment, Connection for structural interconnectedness, and Hierarchical Breadth for hierarchical balance. The system delivers normalized scores between 0 and 10 with actionable feedback, is realized as a Streamlit app that ingests OWL, converts to RDF Turtle, and supplies visualizations, and demonstrates promising effectiveness when applied to the Plant Ontology, Gene Ontology, Semanticscience Integrated Ontology, Food Ontology, Dublin Core, and GoodRelat
What carries the argument
The WiseOWL four-metric scoring system that quantifies descriptiveness and correctness to guide ontology selection and reuse.
If this is right
- Ontology selection moves from intuition to a reproducible scoring process that can be defended in project decisions.
- Developers receive specific feedback on which aspects of an ontology need improvement before reuse.
- The Streamlit implementation enables quick, interactive assessment and visualization without custom tooling.
- Normalized 0-10 scores allow direct comparison across ontologies from different domains.
- Consistent reuse reduces duplication and supports more reliable machine-operable semantic content.
Where Pith is reading between the lines
- The embedding-based check for label-definition alignment could be updated as new language models appear, potentially improving the Well-Defined metric over time.
- Extending the evaluation to larger sets of ontologies or additional domains might show whether metric weights need domain-specific tuning.
- The methodology could be combined with usage statistics from ontology repositories to create hybrid recommendation systems.
- If the scores predict real-world reuse success, they might serve as a lightweight quality filter before deeper manual review.
Load-bearing premise
The four metrics together capture what makes an ontology suitable for reuse and semantically correct, and that results from six test cases are sufficient to establish the method's effectiveness.
What would settle it
An independent ranking of the same six ontologies by domain experts for reuse suitability that shows no correlation with the WiseOWL scores, or a controlled reuse experiment where high-scoring ontologies produce more inconsistencies in integrated applications than low-scoring ones.
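The first test described above amounts to a rank correlation between expert judgments and WiseOWL scores. A minimal Python sketch follows, using Spearman's rho with average ranks for ties. The function names and all numbers are illustrative placeholders, not results from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical placeholder values for six ontologies (NOT from the paper).
expert_ratings = [4.0, 5.0, 3.0, 3.5, 2.0, 2.5]
wiseowl_scores = [7.1, 8.3, 6.0, 6.4, 4.2, 5.0]
rho = spearman_rho(expert_ratings, wiseowl_scores)
```

A rho near zero on real expert rankings would count against the method. With only six ontologies, though, any such coefficient carries wide uncertainty, which is one reason a larger test set matters.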
Original abstract
The Semantic Web standardizes concept meaning for humans and machines, enabling machine-operable content and consistent interpretation that improves advanced analytics. Reusing ontologies speeds development and enforces consistency, yet selecting the optimal choice is challenging because authors lack systematic selection criteria and often rely on intuition that is difficult to justify, limiting reuse. To solve this, WiseOWL is proposed, a methodology with scoring and guidance to select ontologies for reuse. It scores four metrics: (i) Well-Described, measuring documentation coverage; (ii) Well-Defined, using state-of-the-art embeddings to assess label-definition alignment; (iii) Connection, capturing structural interconnectedness; and (iv) Hierarchical Breadth, reflecting hierarchical balance. WiseOWL outputs normalized 0-10 scores with actionable feedback. Implemented as a Streamlit app, it ingests OWL format, converts to RDF Turtle, and provides interactive visualizations. Evaluation across six ontologies, including the Plant Ontology (PO), Gene Ontology (GO), Semanticscience Integrated Ontology (SIO), Food Ontology (FoodON), Dublin Core (DC), and GoodRelations, demonstrates promising effectiveness.
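To make the Well-Described idea concrete, here is a minimal Python sketch of a documentation-coverage score. The abstract does not give the formula, so everything in it is an assumption: the toy dictionary representation, the choice of `label` and `definition` as the required annotations, and the linear scaling to 0-10.

```python
def documentation_coverage(classes, required=("label", "definition")):
    """Fraction of classes that carry every required, non-empty annotation.

    `classes` maps a class IRI to a dict of its annotations. Both the data
    layout and the default `required` tuple are assumptions for illustration.
    """
    if not classes:
        return 0.0
    documented = sum(
        1
        for annotations in classes.values()
        if all(str(annotations.get(key) or "").strip() for key in required)
    )
    return documented / len(classes)


# Toy ontology fragment with hypothetical IRIs and annotations.
toy_classes = {
    "ex:Leaf": {"label": "leaf", "definition": "A lateral organ of a shoot."},
    "ex:Root": {"label": "root", "definition": ""},  # definition is empty
    "ex:Stem": {"label": "stem"},  # definition is missing
    "ex:Seed": {"label": "seed", "definition": "A mature, fertilized ovule."},
}

coverage = documentation_coverage(toy_classes)  # 2 of 4 classes documented
well_described_score = round(10 * coverage, 2)  # assumed linear 0-10 scaling
```

In a real pipeline the annotations would be read from the converted Turtle graph rather than a hand-written dictionary, and the set of annotation properties that count as documentation would need to be stated explicitly.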
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes WiseOWL, a methodology to support ontology reuse and recommendation by scoring four metrics: Well-Described (documentation coverage), Well-Defined (embedding-based assessment of label-definition alignment), Connection (structural interconnectedness), and Hierarchical Breadth (hierarchical balance). Scores are normalized to a 0-10 scale with actionable feedback; the approach is implemented as a Streamlit app that ingests OWL files and produces visualizations. Evaluation on six ontologies (PO, GO, SIO, FoodON, DC, GoodRelations) is reported to demonstrate promising effectiveness.
Significance. If the metrics receive precise, reproducible definitions and the evaluation is expanded with quantitative results, baselines, and external validation, WiseOWL could address a genuine practical need in the Semantic Web by offering a systematic alternative to intuition-based ontology selection. The interactive tool implementation is a concrete strength that could facilitate adoption and further testing.
major comments (3)
- [§3] §3 (Metric definitions): The four metrics are introduced at a high level without equations, algorithms, or parameter specifications. For instance, the Well-Defined metric invokes 'state-of-the-art embeddings' to assess label-definition alignment but supplies no model, similarity function, aggregation method, or threshold; this is load-bearing because the central claim that WiseOWL measures semantic correctness cannot be evaluated or reproduced without these details.
- [§5] §5 (Evaluation): The evaluation states that the six ontologies yield 'promising effectiveness' yet reports no tabulated metric scores, no statistical tests, no error analysis, and no comparison to any baseline selection heuristic or prior method. This directly undermines the claim that the methodology has been shown to be effective.
- [§3.2] §3.2 (Well-Defined metric): No external validation (e.g., correlation with expert ratings of semantic correctness or observed reuse frequency) is provided for any metric, including the embedding-based one. Without such grounding, it remains unclear whether high scores actually predict better ontology reuse, which is required for the methodology's stated purpose.
minor comments (2)
- [Abstract and §4] The normalization procedure that maps raw metric values to the 0-10 scale is mentioned in the abstract but not specified in the main text; adding the exact transformation (including any free parameters) would improve reproducibility.
- [§4] The Streamlit app description could usefully include screenshots or explicit details on the interactive visualizations and the form of the actionable feedback generated for each metric.
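The first minor comment asks for the exact transformation onto the 0-10 scale. One plausible form is a clamped min-max rescaling, sketched below in Python. This is an assumption, not the paper's procedure; `lo` and `hi` are exactly the kind of free parameters the referee asks the authors to report.

```python
def normalize_to_ten(raw, lo, hi):
    """Map `raw` from [lo, hi] onto [0, 10], clamping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = (raw - lo) / (hi - lo)
    return 10.0 * min(1.0, max(0.0, scaled))


# Hypothetical raw values: a cosine similarity in [-1, 1] and a ratio in [0, 1].
similarity_score = normalize_to_ten(0.5, lo=-1.0, hi=1.0)
ratio_score = normalize_to_ten(0.8, lo=0.0, hi=1.0)
clamped_score = normalize_to_ten(1.7, lo=0.0, hi=1.0)
```

The choice of bounds changes the ranking of ontologies whenever metrics are combined, so reporting them per metric is part of making the scores comparable.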
Simulated Author's Rebuttal
We thank the referee for the insightful comments on our manuscript describing the WiseOWL methodology. We address each of the major comments below and indicate the revisions we plan to make to strengthen the paper.
point-by-point responses
- Referee: [§3] §3 (Metric definitions): The four metrics are introduced at a high level without equations, algorithms, or parameter specifications. For instance, the Well-Defined metric invokes 'state-of-the-art embeddings' to assess label-definition alignment but supplies no model, similarity function, aggregation method, or threshold; this is load-bearing because the central claim that WiseOWL measures semantic correctness cannot be evaluated or reproduced without these details.
  Authors: We agree with the referee that the metric definitions in section 3 are currently described at a conceptual level without sufficient formal details. To address this, we will revise the manuscript to include explicit equations for each metric, detailed algorithms (including pseudocode), and specific parameter values. For the Well-Defined metric, we will specify the embedding model used (such as a particular Sentence-BERT variant), the similarity function (cosine similarity), the method for aggregating alignment scores across labels and definitions, and any decision thresholds. These additions will make the methodology fully reproducible and allow readers to evaluate the semantic correctness claims.
  revision: yes
- Referee: [§5] §5 (Evaluation): The evaluation states that the six ontologies yield 'promising effectiveness' yet reports no tabulated metric scores, no statistical tests, no error analysis, and no comparison to any baseline selection heuristic or prior method. This directly undermines the claim that the methodology has been shown to be effective.
  Authors: We acknowledge that the evaluation section is limited and does not provide the quantitative details necessary to robustly support the effectiveness claim. In the revised manuscript, we will include a table presenting the normalized scores for all four metrics across the six ontologies. We will also add comparisons to baseline approaches, such as selecting ontologies based on their size or the presence of documentation alone, and include basic statistical analysis of the scores. An error analysis discussing any discrepancies or limitations observed will be incorporated. These changes will provide a more solid foundation for the 'promising effectiveness' statement.
  revision: yes
- Referee: [§3.2] §3.2 (Well-Defined metric): No external validation (e.g., correlation with expert ratings of semantic correctness or observed reuse frequency) is provided for any metric, including the embedding-based one. Without such grounding, it remains unclear whether high scores actually predict better ontology reuse, which is required for the methodology's stated purpose.
  Authors: We recognize that external validation is essential to confirm that the metrics, particularly Well-Defined, correlate with actual ontology quality and reuse success. The current work applies the metrics to well-known ontologies and interprets the results qualitatively. For the revision, we will expand the discussion to include the importance of external validation and outline potential methods for future work, such as correlating scores with expert judgments or reuse statistics from ontology repositories. However, conducting a full empirical validation study is outside the scope of this methodology paper and would require substantial additional effort and data access. We will explicitly note this limitation in the revised text.
  revision: partial
- Conducting a comprehensive external validation study correlating metric scores with expert ratings or observed reuse frequencies, as this requires new data collection and analysis beyond the current manuscript's scope.
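The first response above names cosine similarity between label and definition embeddings, an aggregation step, and a threshold. A minimal Python sketch of that pipeline follows. The three-dimensional vectors stand in for real sentence embeddings, and the mean aggregation and the 0.5 threshold are assumptions chosen for illustration, since the manuscript does not yet specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def well_defined_alignment(pairs, threshold=0.5):
    """Return the mean label-definition similarity and the poorly aligned terms.

    `pairs` maps a term name to (label_embedding, definition_embedding).
    Mean aggregation and `threshold` are illustrative assumptions.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    sims = {term: cosine_similarity(lab, dfn) for term, (lab, dfn) in pairs.items()}
    flagged = sorted(term for term, s in sims.items() if s < threshold)
    return sum(sims.values()) / len(sims), flagged


# Toy vectors standing in for embeddings from a real sentence encoder.
toy_pairs = {
    "leaf": ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # identical direction, sim 1
    "root": ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal, sim 0
}
mean_similarity, flagged_terms = well_defined_alignment(toy_pairs)
```

Returning the flagged terms alongside the mean mirrors the "actionable feedback" the abstract describes: a single score says how well an ontology is defined overall, while the list says which terms to fix.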
Circularity Check
No circularity: WiseOWL metrics rely on independent external techniques without self-referential reduction or load-bearing self-citations.
full rationale
The paper defines WiseOWL as a scoring methodology using four metrics computed from standard ontology properties and external components: documentation coverage for Well-Described, state-of-the-art embeddings for label-definition alignment in Well-Defined, graph-based structural measures for Connection, and hierarchical balance for Hierarchical Breadth. These are presented as direct computations with normalized 0-10 outputs and no equations or definitions that reduce outputs to inputs by construction. The evaluation on six ontologies is described as a demonstration of effectiveness rather than a fitted prediction or self-referential validation. No self-citations are invoked to justify uniqueness or core premises, and the methodology draws on independent techniques (embeddings, graph analysis) without smuggling ansatzes or renaming known results. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- Normalization to 0-10 scale
axioms (2)
- domain assumption: State-of-the-art embeddings can accurately assess semantic alignment between labels and definitions
- domain assumption: Structural graph measures and hierarchy balance metrics are meaningful indicators of ontology quality
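The second axiom concerns structural measures. As a rough illustration of what such measures might look like, the Python sketch below computes two quantities over a toy subclass hierarchy: the share of classes that take part in at least one subclass link, and the mean number of direct children per parent. Both formulas are stand-ins chosen for this example; the paper's actual Connection and Hierarchical Breadth definitions are not given in the text reviewed here.

```python
from collections import defaultdict


def structure_stats(classes, subclass_edges):
    """Compute two illustrative structural measures (assumed, not the paper's).

    `classes` is an iterable of class names; `subclass_edges` is a list of
    (child, parent) pairs.
    """
    classes = set(classes)
    if not classes:
        raise ValueError("classes must not be empty")
    children = defaultdict(set)
    linked = set()
    for child, parent in subclass_edges:
        children[parent].add(child)
        linked.update((child, parent))
    connected_share = len(linked & classes) / len(classes)
    # Mean direct children per class that has at least one child.
    mean_branching = (
        sum(len(kids) for kids in children.values()) / len(children)
        if children
        else 0.0
    )
    return connected_share, mean_branching


# Toy hierarchy with hypothetical class names; "Orphan" has no links.
toy_classes = ["Plant", "Organ", "Leaf", "Root", "Orphan"]
toy_edges = [("Organ", "Plant"), ("Leaf", "Organ"), ("Root", "Organ")]
connected_share, mean_branching = structure_stats(toy_classes, toy_edges)
```

Whether a higher or lower branching value signals better quality is itself a modeling decision, which is why the ledger lists this as a domain assumption rather than an established fact.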