BioArtlas: Computational Clustering of Multi-Dimensional Complexity in Bioart
Pith reviewed 2026-05-21 21:17 UTC · model grok-4.3
The pith
BioArtlas clusters 81 bioart works across thirteen dimensions to identify four organizational patterns in the field.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The optimal configuration of axis-aware representations, codebook-based grouping, and agglomerative clustering on four-dimensional UMAP space partitions the 81 works into clusters that display artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities.
What carries the argument
Axis-aware representations paired with codebook-based grouping of related concepts, followed by agglomerative clustering on 4D UMAP embeddings.
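The final step of that chain, agglomerative clustering on a low-dimensional embedding, can be sketched in a few lines. This is only an illustration under stated assumptions: the 4D coordinates are invented stand-ins for UMAP output, average linkage is an assumed choice (the text here does not say which linkage the paper uses), and a real analysis would more likely call a library implementation such as scikit-learn's `AgglomerativeClustering`.

```python
import math
from itertools import combinations

def agglomerative(points, k):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters (a < b) and merge b into a.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Invented 4D coordinates standing in for UMAP output (not from the dataset).
embedding = [
    (0.0, 0.1, 0.0, 0.2), (0.1, 0.0, 0.1, 0.1), (0.2, 0.1, 0.0, 0.0),
    (5.0, 5.1, 4.9, 5.0), (5.1, 5.0, 5.0, 5.2),
    (9.0, 0.0, 9.1, 0.1), (9.1, 0.1, 9.0, 0.0),
]
labels = agglomerative(embedding, k=3)
print(labels)
```

Because every point ends up in some cluster, this family of methods "places every work on the map", which is the property the abstract contrasts with density-based methods that label points as noise.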
If this is right
- Bioart can be studied as a multi-dimensional space rather than a single-axis category.
- Artist-level methodological signatures persist across individual projects.
- Technique choices create detectable segments independent of artist identity.
- Works from different decades can share conceptual clusters while showing stylistic drift.
- Quantitative maps can coexist with public-facing interfaces without sacrificing analytical standards.
Where Pith is reading between the lines
- The same representation-and-clustering pipeline could be applied to other hybrid domains such as digital performance or climate art.
- If the four patterns hold under new data, they could serve as benchmarks for tracking how bioart evolves with new technologies.
- Public access to both the dataset and the interactive explorer lowers the barrier for non-computational researchers to test or extend the groupings.
Load-bearing premise
The thirteen chosen dimensions together with the codebook grouping capture the hybrid and polysemous character of bioart works without large curator bias or loss of meaning.
What would settle it
Re-run the full pipeline on the same 81 works after replacing the thirteen dimensions with an independent set of descriptors, or after removing the codebook step. If cluster memberships change substantially or the four reported patterns disappear, the premise fails; if they persist, it holds.
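One way to make "substantially different cluster memberships" measurable is the adjusted Rand index (ARI), which compares two partitions of the same items and is near 0 for chance-level agreement and 1 for identical groupings. The sketch below is a plain-Python illustration; the choice of ARI is an assumption of this note, not a metric reported in the paper, and the label lists are invented.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two partitions of the same items (assumes at least two items)."""
    n = len(labels_a)
    # Pairs of items that share a cluster in both partitions.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of items that share a cluster within each partition.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)

# Invented labels: the same grouping under different cluster names.
original = [0, 0, 1, 1, 2, 2]
rerun = [2, 2, 0, 0, 1, 1]
print(adjusted_rand_index(original, rerun))
```

ARI ignores how clusters are named, so a re-run that reproduces the same groupings under different cluster numbers still scores 1.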
Original abstract
Bioart brings living material into artistic practice, where a single work can be at once an aesthetic object, a scientific instrument, and an ethical provocation. Traditional categories sort such works along one axis at a time, which flattens the very hybridity that defines the field and leaves curators no way to compare works across many dimensions together. I introduce BioArtlas, a computational atlas that represents each bioartwork along many curated dimensions at once and organizes the field by conceptual similarity rather than by medium or chronology. My method embeds the keywords of all 81 works on each of thirteen interpretive axes, groups related concepts into a shared codebook that tames inconsistent terminology, and then searches systematically for a clustering that is both statistically clean and interpretable. Among the methods that place every work on the map, agglomerative clustering separates the field far more cleanly than the usual k-means baseline (silhouette 0.664 versus 0.483), whereas density-based methods reach higher scores only by discarding most of the corpus as noise. By separating rigorous analysis from public storytelling, BioArtlas turns the tangled complexity of bioart into a navigable landscape, openly available as an interactive interface (https://www.bioartlas.com) and dataset (https://github.com/joonhyungbae/BioArtlas).
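The silhouette values quoted in the abstract (0.664 versus 0.483) come from a standard score: for each item, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the item's score is (b - a) / max(a, b). The following is a minimal plain-Python sketch with Euclidean distance and invented 2D points; the distance metric used in the paper is not stated in this text, so treat that choice as an assumption.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); needs at least two clusters."""
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the item's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Invented toy points: two tight, well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(points, [0, 0, 1, 1]))  # sensible grouping, close to 1
print(silhouette_score(points, [0, 1, 0, 1]))  # poor grouping, below 0
```

A score is only comparable across methods when it is computed over the same items, which is why the abstract sets aside density-based results that score higher after discarding most works as noise.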
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents BioArtlas, a computational framework for clustering 81 bioart works across 13 author-curated dimensions. It employs codebook-based grouping to handle polysemy, evaluates up to 800 combinations of representations and algorithms, identifies Agglomerative clustering with k=15 on 4D UMAP as optimal (silhouette 0.664 ± 0.008, trustworthiness/continuity 0.805/0.812), and interprets the clusters as revealing four organizational patterns: artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities. The work includes a public dataset and interactive web interface.
Significance. If the central claims hold, the paper contributes a reproducible, quantitative method for navigating the multi-dimensional complexity of bioart, with explicit separation of analytical optimization from interpretive communication. The public release of the dataset and interface supports further exploration and cross-disciplinary use in information retrieval and digital humanities.
major comments (2)
- [Results] Results section (cluster interpretation paragraph): The claim that the optimal clustering 'reveals' the four specific organizational patterns is load-bearing for the paper's main contribution, yet the mapping from the 15 clusters to these named patterns appears to rely on post-hoc author judgment. No quantitative validation (e.g., inter-rater agreement, ablation on alternative groupings, or robustness checks against random cluster-to-label assignments) is reported to confirm that these exact patterns emerge reliably rather than reflecting curation choices in the 13 dimensions or codebook.
- [Methods] Methods section (dimension curation and codebook): The 13 dimensions and codebook-based grouping are presented as capturing the hybrid nature of bioart without significant bias, but no sensitivity analysis or comparison to alternative dimension sets is provided. This directly affects whether the four patterns reflect intrinsic structure in the 81 works or author-imposed semantic distinctions.
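The randomization check requested in the first major comment could take roughly the following form: measure how strongly clusters align with an external attribute such as artist, then compare that value with what shuffled attribute labels produce. This is a sketch under assumptions, not the authors' procedure: purity is one possible alignment statistic, and the cluster and artist labels are invented.

```python
import random
from collections import Counter

def purity(cluster_labels, attribute):
    """Fraction of items that carry the majority attribute of their cluster."""
    by_cluster = {}
    for c, a in zip(cluster_labels, attribute):
        by_cluster.setdefault(c, []).append(a)
    majority_total = sum(
        Counter(values).most_common(1)[0][1] for values in by_cluster.values()
    )
    return majority_total / len(cluster_labels)

def permutation_p_value(cluster_labels, attribute, n_permutations=2000, seed=0):
    """Observed purity and the share of shuffles that match or beat it."""
    rng = random.Random(seed)
    observed = purity(cluster_labels, attribute)
    shuffled = list(attribute)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if purity(cluster_labels, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)

# Invented example: three clusters that coincide exactly with three artists.
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2]
artists = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
print(permutation_p_value(clusters, artists))
```

A small p-value would indicate that artist cohesion is unlikely under random assignment. It would not show that the four named patterns are the right reading of the clusters, which is the separate concern about interpretive judgment.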
minor comments (2)
- [Abstract] Abstract and Methods: The phrase 'novel axis-aware representations' is used without a precise definition or equation showing how semantic distinctions are preserved during cross-dimensional comparison.
- [Evaluation] Evaluation paragraph: The ±0.008 on the silhouette score should specify whether this is standard deviation across runs or another measure, and the exact number of runs should be stated for reproducibility.
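The reporting style requested in the second minor comment is easy to make explicit. The sketch below assumes the ± figure is a sample standard deviation across repeated runs (which is exactly what the referee asks the authors to confirm); the scores are invented.

```python
import statistics

def summarize_runs(scores):
    """Format repeated-run scores as mean ± sample standard deviation."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)
    return f"{mean:.3f} ± {std:.3f} (sample SD, n={len(scores)} runs)"

# Invented silhouette scores from five hypothetical seeds.
print(summarize_runs([0.66, 0.67, 0.65, 0.67, 0.66]))
```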
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which highlights important distinctions between our quantitative optimization procedure and the interpretive analysis of the resulting clusters. We address each major comment below, proposing targeted revisions to strengthen the manuscript while preserving the separation of analytical and communicative components.
Point-by-point responses
-
Referee: [Results] Results section (cluster interpretation paragraph): The claim that the optimal clustering 'reveals' the four specific organizational patterns is load-bearing for the paper's main contribution, yet the mapping from the 15 clusters to these named patterns appears to rely on post-hoc author judgment. No quantitative validation (e.g., inter-rater agreement, ablation on alternative groupings, or robustness checks against random cluster-to-label assignments) is reported to confirm that these exact patterns emerge reliably rather than reflecting curation choices in the 13 dimensions or codebook.
Authors: We agree that the four organizational patterns are interpretive observations drawn from inspecting the dimension distributions and representative works within the 15 clusters, rather than outputs of an automated labeling procedure. The load-bearing quantitative contribution remains the exhaustive evaluation of up to 800 representation-algorithm combinations that selected Agglomerative clustering at k=15 on 4D UMAP (silhouette 0.664 ± 0.008). In revision we will augment the Results section with per-cluster quantitative summaries (e.g., mean or mode values for each of the 13 dimensions, plus counts of works per artist or time period) that directly support the named patterns, together with two or three concrete example works per pattern. We will also add an explicit statement that these patterns constitute author-guided interpretation of the optimized grouping and that formal inter-rater or randomization tests would require a separate annotation study, which we flag as future work. This revision clarifies the evidential basis without overstating the quantitative support for the interpretive layer. revision: partial
-
Referee: [Methods] Methods section (dimension curation and codebook): The 13 dimensions and codebook-based grouping are presented as capturing the hybrid nature of bioart without significant bias, but no sensitivity analysis or comparison to alternative dimension sets is provided. This directly affects whether the four patterns reflect intrinsic structure in the 81 works or author-imposed semantic distinctions.
Authors: The 13 dimensions were derived from a systematic review of bioart literature and prior curatorial frameworks to span methodological, technical, temporal, and conceptual axes; the codebook was constructed by grouping polysemous terms observed across the 81 works. We acknowledge that no explicit sensitivity analysis against alternative dimension sets was reported. In the revised Methods section we will (1) provide a table or paragraph justifying each dimension with supporting references from the bioart scholarship, (2) describe the codebook construction process in greater detail, and (3) include a short discussion of limitations together with a limited robustness check: re-running the full evaluation of up to 800 combinations on a reduced 11-dimension subset (dropping the two least frequently used of the 13 dimensions) to verify that the top-ranked configuration and the four high-level patterns remain stable. This addition directly addresses the concern about author-imposed structure while remaining within the scope of the existing dataset. revision: yes
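The per-cluster summaries promised in the first response (modal values per dimension, counts of works per artist or period) amount to a simple group-and-count. The sketch below is hypothetical throughout: the field names `artist`, `technique`, and `decade` and all records are invented for illustration and are not taken from the BioArtlas dataset.

```python
from collections import Counter, defaultdict

def cluster_summaries(works, labels, fields):
    """Per cluster: size plus the most common value (and its count) per field."""
    grouped = defaultdict(list)
    for work, label in zip(works, labels):
        grouped[label].append(work)

    summary = {}
    for label, members in sorted(grouped.items()):
        row = {"size": len(members)}
        for field in fields:
            # most_common(1)[0] is a (value, count) pair for the modal value.
            row[field] = Counter(w[field] for w in members).most_common(1)[0]
        summary[label] = row
    return summary

# Invented records with hypothetical field names.
works = [
    {"artist": "Artist A", "technique": "tissue culture", "decade": "2000s"},
    {"artist": "Artist A", "technique": "tissue culture", "decade": "2000s"},
    {"artist": "Artist B", "technique": "tissue culture", "decade": "2010s"},
    {"artist": "Artist C", "technique": "gene editing", "decade": "2010s"},
    {"artist": "Artist C", "technique": "gene editing", "decade": "2010s"},
]
labels = [0, 0, 0, 1, 1]
for label, row in cluster_summaries(works, labels, ["artist", "technique", "decade"]).items():
    print(label, row)
```

Reading such a table next to the four named patterns would let a reader see, for example, whether a cluster described as artist-specific really is dominated by one artist.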
Circularity Check
No significant circularity in derivation chain
The paper applies standard unsupervised clustering (Agglomerative at k=15 on 4D UMAP) to author-curated data from 81 works across 13 dimensions, with optimization and evaluation performed using external metrics such as silhouette score (0.664), trustworthiness, and continuity. The four organizational patterns are presented as interpretive labels assigned after clustering rather than as outputs of any equations or derivations that reduce by construction to the input parameters, fitted values, or self-citations. No load-bearing steps match the enumerated circularity patterns; the analysis remains self-contained against the reported benchmarks without renaming known results or smuggling ansatzes via citation.
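For readers unfamiliar with the two embedding-quality metrics named above, their standard definitions (due to Venna and Kaski) are given here for reference. The neighborhood size K is a separate parameter from the cluster count k=15, and its value is not stated in this text; n is the number of items (here 81), and the normalization assumes K < n/2.

```latex
T(K) = 1 - \frac{2}{nK(2n - 3K - 1)} \sum_{i=1}^{n} \sum_{j \in U_K(i)} \bigl( r(i,j) - K \bigr)

C(K) = 1 - \frac{2}{nK(2n - 3K - 1)} \sum_{i=1}^{n} \sum_{j \in V_K(i)} \bigl( \hat{r}(i,j) - K \bigr)
```

Here U_K(i) is the set of points that are among the K nearest neighbors of i in the embedding but not in the original space, with r(i,j) the rank of j among the neighbors of i in the original space. V_K(i) is the set of points that are among the K nearest neighbors of i in the original space but not in the embedding, with \hat{r}(i,j) the rank of j in the embedding. Trustworthiness T penalizes false neighbors introduced by the projection, and continuity C penalizes true neighbors that the projection separates.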
Axiom & Free-Parameter Ledger
free parameters (2)
- k=15
- 4D UMAP
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Comprehensive evaluation of up to 800 representation-space-algorithm combinations identifies Agglomerative clustering at k=15 on 4D UMAP as optimal (silhouette 0.664 ± 0.008)"
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The approach reveals four organizational patterns: artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.