{"paper":{"title":"scShapeBench: Discovering geometry from high dimensional scRNAseq data","license":"http://creativecommons.org/licenses/by/4.0/","headline":"scReebTower extracts Reeb graphs from diffusion geometry to classify single-cell data shapes more accurately than PAGA or Mapper.","cross_cats":["q-bio.GN"],"primary_cat":"cs.LG","authors_text":"Andrew J Steindl, Brian Tshilengi Di Bassinga, César Miguel Valdez Córdova, Christine L Chaffer, Daniel Neumann, Dhananjay Bhaskar, Guy Wolf, Ihuan Gunawan, João Felipe Rocha, John G Lock, Leire Torices, Matthew Scicluna, Shabarni Gupta, Smita Krishnaswamy, Timothy J. Mann, Zachary Warren","submitted_at":"2026-05-12T19:10:38Z","abstract_excerpt":"High-dimensional point cloud data arise across many scientific domains, especially single-cell biology. The shapes or topologies of these datasets determine the types of information that can be extracted. For example, clustered data supports cell-type identification, trajectory structures support transition analysis, and archetypal structures capture continua of cellular behaviors. Existing analysis pipelines often assume a specific shape. 
The standard Seurat pipeline combines UMAP visualization with Louvain clustering and therefore assumes clustered data, while tools such as Monocle and SPADE"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Our results indicate that scReebTower outperforms existing baselines.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Expert annotations of real single-cell datasets into the four discrete shape categories are accurate, consistent, and sufficient to represent the geometries that matter for downstream analysis.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"scShapeBench supplies synthetic and real annotated single-cell datasets across four shape categories, with scReebTower outperforming PAGA and Mapper on topology-aware metrics.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"scReebTower extracts Reeb graphs from diffusion geometry to classify single-cell data shapes more accurately than PAGA or Mapper.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"5af5fd5b0db9345ecb29fb739d094be206bd84a0325f1f2c841461a42d6675bd"},"source":{"id":"2605.12662","kind":"arxiv","version":1},"verdict":{"id":"0f8cb251-39e1-400a-810c-a22c6860051b","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T21:17:41.406383Z","strongest_claim":"Our results indicate that scReebTower outperforms existing baselines.","one_line_summary":"scShapeBench supplies synthetic and real annotated single-cell datasets across four shape categories, with scReebTower outperforming PAGA and Mapper on topology-aware metrics.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Expert annotations of real 
single-cell datasets into the four discrete shape categories are accurate, consistent, and sufficient to represent the geometries that matter for downstream analysis.","pith_extraction_headline":"scReebTower extracts Reeb graphs from diffusion geometry to classify single-cell data shapes more accurately than PAGA or Mapper."},"references":{"count":31,"sample":[{"doi":"","year":2015,"title":"Spatial reconstruction of single-cell gene expression data. Nat Biotechnol, 33(5):495–502, April 2015","work_id":"d6f63a84-ba1e-485b-8397-2e799c59b747","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"The single-cell transcriptional landscape of mammalian organogenesis. Nature, 566(7745):496–502, February 2019","work_id":"d4878133-0c08-4140-9ad7-ccf505af790e","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2016,"title":"Visualization and cellular hierarchy inference of single-cell data using SPADE. Nature protocols, 11(7):1264–1279, 2016","work_id":"ca7c1c0d-51eb-4f76-9fe5-04e55d877399","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2022,"title":"Guillaume Huguet, D. S. Magruder, Alexander Tong, Oluwadamilola Fasina, Manik Kuchroo, Guy Wolf, and Smita Krishnaswamy. 
Manifold interpolating optimal-transport flows for trajectory inference, 2022","work_id":"73eae370-119d-4649-84d5-718717f550e8","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"TrajectoryNet: A dynamic optimal transport network for modeling cellular dynamics","work_id":"eeffe8f1-4e0b-4481-ae79-30d46bb794c0","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":31,"snapshot_sha256":"0f274f3baec6a61776a9a0cbb7ebd5a7c6b0438b425c20d2b903ab57aff8d947","internal_anchors":2},"formal_canon":{"evidence_count":2,"snapshot_sha256":"a46fbace7d4f6f922cfabec5b01c25dfbd406b2db964bd9a91e3bd288abedd7b"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}