
arXiv: 2511.19162 · v2 · pith:RU23GPTMnew · submitted 2025-09-27 · 💻 cs.IR · cs.CY · cs.HC · cs.LG · cs.MM

BioArtlas: Computational Clustering of Multi-Dimensional Complexity in Bioart

Pith reviewed 2026-05-21 21:17 UTC · model grok-4.3

classification 💻 cs.IR · cs.CY · cs.HC · cs.LG · cs.MM
keywords bioart · clustering · multi-dimensional analysis · computational humanities · UMAP · agglomerative clustering · art classification · polysemy handling

The pith

BioArtlas clusters 81 bioart works across thirteen dimensions to identify four organizational patterns in the field.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Bioart mixes art, science, technology, ethics and politics in ways that no single category can capture. The paper builds BioArtlas, which represents each of 81 works along thirteen curated dimensions and uses codebook grouping to merge overlapping terms. It evaluates up to 800 combinations of representation, embedding space, and algorithm, and selects agglomerative clustering with k=15 on a four-dimensional UMAP embedding. The author reads four patterns from the resulting clusters: works by the same artist stay close, techniques form distinct groups, styles shift over time, and some concepts link works from different periods. The quantitative analysis is kept separate from a public web explorer, so the optimization serves rigor and the interface serves access.
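To make the codebook idea concrete, here is a minimal sketch of how synonymous terms could be folded into shared concepts and then encoded with one block of columns per axis. The axis names, terms, and the multi-hot encoding are illustrative assumptions; the paper's actual codebook and its "axis-aware" representation are not specified on this page.

```python
# Hypothetical codebook: surface term -> canonical concept, kept separate per axis.
# Axis names and terms are invented for illustration, not taken from BioArtlas.
CODEBOOK = {
    "technique": {
        "tissue culture": "tissue_engineering",
        "tissue engineering": "tissue_engineering",
        "transgenesis": "genetic_modification",
        "gene editing": "genetic_modification",
    },
    "theme": {
        "bioethics": "ethics",
        "ethics of care": "ethics",
        "ecology": "environment",
    },
}


def canonical_concepts(axis):
    """Sorted canonical concepts for one axis; this fixes the column order."""
    return sorted(set(CODEBOOK[axis].values()))


def encode_work(work):
    """Multi-hot vector with one contiguous block of columns per axis."""
    vector = []
    for axis in sorted(CODEBOOK):
        lookup = CODEBOOK[axis]
        # Map each raw term to its canonical concept; unknown terms are skipped.
        present = {
            lookup[term.lower()]
            for term in work.get(axis, [])
            if term.lower() in lookup
        }
        vector.extend(1 if concept in present else 0
                      for concept in canonical_concepts(axis))
    return vector


# Two works described with different words that the codebook treats as equal.
work_a = {"technique": ["Tissue culture"], "theme": ["Bioethics"]}
work_b = {"technique": ["tissue engineering"], "theme": ["ethics of care"]}
print(encode_work(work_a), encode_work(work_b))
```

Keeping each axis in its own block means a term shared by two axes never collides, which is one plausible reading of preserving semantic distinctions while still allowing whole-vector comparison.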

Core claim

The optimal configuration of axis-aware representations, codebook-based grouping, and agglomerative clustering on four-dimensional UMAP space partitions the 81 works into clusters that display artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities.

What carries the argument

Axis-aware representations paired with codebook-based grouping of related concepts, followed by agglomerative clustering on 4D UMAP embeddings.
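The clustering step can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the points stand in for 4D UMAP coordinates, and average linkage is chosen only for brevity, since the linkage used in the paper is not stated here. A real analysis would more likely use `umap.UMAP(n_components=4)` from umap-learn followed by `sklearn.cluster.AgglomerativeClustering`.

```python
import math


def average_linkage(points, k):
    """Merge the two closest clusters until k remain; returns lists of indices."""
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Average pairwise Euclidean distance between members of a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Fold cluster y into cluster x, then drop y.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy stand-in for 4D embedding coordinates (not real BioArtlas data).
embedding = [
    (0.0, 0.0, 0.0, 0.0), (0.1, 0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0, 5.0), (5.1, 5.0, 4.9, 5.0),
    (9.0, 0.0, 9.0, 0.0),
]
clusters = average_linkage(embedding, k=3)
print(clusters)
```

The naive search over all cluster pairs is cubic in the number of points, which is acceptable at the scale of 81 works but not for large collections.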

If this is right

  • Bioart can be studied as a multi-dimensional space rather than a single-axis category.
  • Artist-level methodological signatures persist across individual projects.
  • Technique choices create detectable segments independent of artist identity.
  • Works from different decades can share conceptual clusters while showing stylistic drift.
  • Quantitative maps can coexist with public-facing interfaces without sacrificing analytical standards.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same representation-and-clustering pipeline could be applied to other hybrid domains such as digital performance or climate art.
  • If the four patterns hold under new data, they could serve as benchmarks for tracking how bioart evolves with new technologies.
  • Public access to both the dataset and the interactive explorer lowers the barrier for non-computational researchers to test or extend the groupings.

Load-bearing premise

The thirteen chosen dimensions together with the codebook grouping capture the hybrid and polysemous character of bioart works without large curator bias or loss of meaning.

What would settle it

Re-run the full pipeline on the same 81 works after replacing the thirteen dimensions with an independent set of descriptors, or after removing the codebook step. If cluster memberships change substantially or the four reported patterns disappear, the premise fails; if they persist, it holds.
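"Substantially different cluster memberships" can be given a number. One common choice, assumed here rather than taken from the paper, is the adjusted Rand index (ARI) between the original partition and the re-run partition: 1.0 means identical groupings regardless of label names, and values near 0 mean agreement no better than chance. The label lists below are invented for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items (n >= 2)."""
    n = len(labels_a)
    # Pairs of items placed together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each partition on its own.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:  # degenerate case, e.g. both partitions are trivial
        return 1.0
    return (together_both - expected) / (maximum - expected)


baseline = [0, 0, 1, 1, 2, 2]    # hypothetical original cluster labels
relabelled = [2, 2, 0, 0, 1, 1]  # same grouping, different label names
perturbed = [0, 1, 1, 2, 2, 0]   # every original pair split apart

print(adjusted_rand_index(baseline, relabelled))
print(adjusted_rand_index(baseline, perturbed))
```

Because the index is invariant to label names, it compares groupings directly and does not require matching cluster 3 in one run to cluster 3 in another.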

Figures

Figures reproduced from arXiv: 2511.19162 by Joonhyung Bae.

Figure 1: BioArtlas interactive visualization interface.
Original abstract

Bioart's hybrid nature spanning art, science, technology, ethics, and politics defies traditional single-axis categorization. I present BioArtlas, analyzing 81 bioart works across thirteen curated dimensions using novel axis-aware representations that preserve semantic distinctions while enabling cross-dimensional comparison. Our codebook-based approach groups related concepts into unified clusters, addressing polysemy in cultural terminology. Comprehensive evaluation of up to 800 representation-space-algorithm combinations identifies Agglomerative clustering at k=15 on 4D UMAP as optimal (silhouette 0.664 +/- 0.008, trustworthiness/continuity 0.805/0.812). The approach reveals four organizational patterns: artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities. By separating analytical optimization from public communication, I provide rigorous analysis and accessible exploration through an interactive web interface (https://www.bioartlas.com) with the dataset publicly available (https://github.com/joonhyungbae/BioArtlas).
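For readers unfamiliar with the silhouette score quoted in the abstract, the following sketch computes it from scratch with Euclidean distance. The points and labels are toy values, not BioArtlas data, and the function assumes at least two clusters and no cluster made entirely of identical points; in practice `sklearn.metrics.silhouette_score` would be the usual tool.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    members = {}
    for idx, label in enumerate(labels):
        members.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = members[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_label, other in members.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters cut across the gap
print(round(good, 3), round(bad, 3))
```

Scores range from -1 to 1, so a value such as 0.664 indicates clusters that are clearly, though not perfectly, separated in the embedding space where the score was computed.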

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents BioArtlas, a computational framework for clustering 81 bioart works across 13 author-curated dimensions. It employs codebook-based grouping to handle polysemy, evaluates up to 800 combinations of representations and algorithms, identifies Agglomerative clustering with k=15 on 4D UMAP as optimal (silhouette 0.664 ± 0.008, trustworthiness/continuity 0.805/0.812), and interprets the clusters as revealing four organizational patterns: artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities. The work includes a public dataset and interactive web interface.

Significance. If the central claims hold, the paper contributes a reproducible, quantitative method for navigating the multi-dimensional complexity of bioart, with explicit separation of analytical optimization from interpretive communication. The public release of the dataset and interface supports further exploration and cross-disciplinary use in information retrieval and digital humanities.

major comments (2)
  1. [Results] Results section (cluster interpretation paragraph): The claim that the optimal clustering 'reveals' the four specific organizational patterns is load-bearing for the paper's main contribution, yet the mapping from the 15 clusters to these named patterns appears to rely on post-hoc author judgment. No quantitative validation (e.g., inter-rater agreement, ablation on alternative groupings, or robustness checks against random cluster-to-label assignments) is reported to confirm that these exact patterns emerge reliably rather than reflecting curation choices in the 13 dimensions or codebook.
  2. [Methods] Methods section (dimension curation and codebook): The 13 dimensions and codebook-based grouping are presented as capturing the hybrid nature of bioart without significant bias, but no sensitivity analysis or comparison to alternative dimension sets is provided. This directly affects whether the four patterns reflect intrinsic structure in the 81 works or author-imposed semantic distinctions.
minor comments (2)
  1. [Abstract] Abstract and Methods: The phrase 'novel axis-aware representations' is used without a precise definition or equation showing how semantic distinctions are preserved during cross-dimensional comparison.
  2. [Evaluation] Evaluation paragraph: The ±0.008 on the silhouette score should specify whether this is standard deviation across runs or another measure, and the exact number of runs should be stated for reproducibility.
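The reporting that the second minor comment asks for is small in practice. The sketch below states the spread measure and the run count alongside the mean; the scores are hypothetical values invented for illustration, not results from the paper.

```python
import statistics


def summarize_runs(scores):
    """Format mean and spread with the measure and run count made explicit."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    return f"{mean:.3f} ± {std:.3f} (sample std, n={len(scores)} runs)"


# Hypothetical silhouette scores from five random seeds.
scores = [0.70, 0.72, 0.71, 0.69, 0.73]
print(summarize_runs(scores))
```

Stating whether the figure is a sample standard deviation, a standard error, or a range matters because the three can differ several-fold for the same runs.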

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important distinctions between our quantitative optimization procedure and the interpretive analysis of the resulting clusters. We address each major comment below, proposing targeted revisions to strengthen the manuscript while preserving the separation of analytical and communicative components.

  1. Referee: [Results] Results section (cluster interpretation paragraph): The claim that the optimal clustering 'reveals' the four specific organizational patterns is load-bearing for the paper's main contribution, yet the mapping from the 15 clusters to these named patterns appears to rely on post-hoc author judgment. No quantitative validation (e.g., inter-rater agreement, ablation on alternative groupings, or robustness checks against random cluster-to-label assignments) is reported to confirm that these exact patterns emerge reliably rather than reflecting curation choices in the 13 dimensions or codebook.

    Authors: We agree that the four organizational patterns are interpretive observations drawn from inspecting the dimension distributions and representative works within the 15 clusters, rather than outputs of an automated labeling procedure. The load-bearing quantitative contribution remains the exhaustive evaluation of up to 800 representation-algorithm combinations that selected Agglomerative clustering at k=15 on 4D UMAP (silhouette 0.664 ± 0.008). In revision we will augment the Results section with per-cluster quantitative summaries (e.g., mean or mode values for each of the 13 dimensions, plus counts of works per artist or time period) that directly support the named patterns, together with two or three concrete example works per pattern. We will also add an explicit statement that these patterns constitute author-guided interpretation of the optimized grouping and that formal inter-rater or randomization tests would require a separate annotation study, which we flag as future work. This revision clarifies the evidential basis without overstating the quantitative support for the interpretive layer. revision: partial

  2. Referee: [Methods] Methods section (dimension curation and codebook): The 13 dimensions and codebook-based grouping are presented as capturing the hybrid nature of bioart without significant bias, but no sensitivity analysis or comparison to alternative dimension sets is provided. This directly affects whether the four patterns reflect intrinsic structure in the 81 works or author-imposed semantic distinctions.

    Authors: The 13 dimensions were derived from a systematic review of bioart literature and prior curatorial frameworks to span methodological, technical, temporal, and conceptual axes; the codebook was constructed by grouping polysemous terms observed across the 81 works. We acknowledge that no explicit sensitivity analysis against alternative dimension sets was reported. In the revised Methods section we will (1) provide a table or paragraph justifying each dimension with supporting references from the bioart scholarship, (2) describe the codebook construction process in greater detail, and (3) include a short discussion of limitations together with a limited robustness check: re-running the evaluation of up to 800 combinations on a reduced 11-dimension subset (dropping the two least frequent dimensions) to verify that the top-ranked configuration and the four high-level patterns remain stable. This addition directly addresses the concern about author-imposed structure while remaining within the scope of the existing dataset. revision: yes
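One randomization check on the artist-cohesion pattern needs no additional annotation, only the existing artist metadata and cluster labels. The sketch below is an assumption about how such a test could look, not a procedure from the paper: it counts pairs of works that share both an artist and a cluster, then asks how often shuffled cluster labels do as well. The artist and cluster lists are toy values.

```python
import random
from collections import Counter


def same_artist_same_cluster(artists, clusters):
    """Count pairs of works that share both an artist and a cluster."""
    return sum(n * (n - 1) // 2 for n in Counter(zip(artists, clusters)).values())


def permutation_p_value(artists, clusters, n_permutations=2000, seed=0):
    """Share of label shuffles whose cohesion is at least the observed value."""
    rng = random.Random(seed)
    observed = same_artist_same_cluster(artists, clusters)
    shuffled = list(clusters)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any link between artist and cluster
        if same_artist_same_cluster(artists, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data: three artists whose works fall perfectly into three clusters.
artists = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2]
p = permutation_p_value(artists, clusters)
print(same_artist_same_cluster(artists, clusters), p)
```

Shuffling keeps the cluster sizes fixed, so the test isolates the association between artist and cluster rather than rewarding a partition for having a few large groups.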

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper applies standard unsupervised clustering (Agglomerative at k=15 on 4D UMAP) to author-curated data from 81 works across 13 dimensions, with optimization and evaluation performed using external metrics such as silhouette score (0.664), trustworthiness, and continuity. The four organizational patterns are presented as interpretive labels assigned after clustering rather than as outputs of any equations or derivations that reduce by construction to the input parameters, fitted values, or self-citations. No load-bearing steps match the enumerated circularity patterns; the analysis remains self-contained against the reported benchmarks without renaming known results or smuggling ansatzes via citation.

Axiom & Free-Parameter Ledger

2 free parameters · 0 axioms · 0 invented entities

The central claim rests on the validity of the 13 hand-curated dimensions and the assumption that clustering outputs correspond to meaningful artistic patterns rather than artifacts of representation choices.

free parameters (2)
  • k=15
    Number of clusters selected as optimal after evaluating multiple values
  • 4D UMAP
    Dimensionality reduction target chosen after testing combinations


