GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents
Pith reviewed 2026-05-18 07:20 UTC · model grok-4.3
The pith
A multi-agent LLM system routes cellular images to the best segmentation tool on the fly and matches or beats fixed models across diverse benchmarks without training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GenCellAgent orchestrates specialist segmenters and generalist vision-language models via a planner-executor-evaluator loop with long-term memory. The system automatically routes each image to the most suitable tool, adapts tool behavior on the fly with a small number of reference images when imaging conditions differ, supports text-guided segmentation of organelles not covered by existing models, and stores expert edits in memory to enable self-evolution and personalized workflows. Across seven cell-segmentation benchmarks spanning diverse microscopy modalities and totaling 4,718 images, this routing consistently matches or exceeds the best individual tool on every dataset while outperforming all baselines in overall accuracy.
What carries the argument
planner-executor-evaluator loop with long-term memory that routes images to tools, executes them, checks quality, and adapts via references or text prompts
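The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: in GenCellAgent the planner and evaluator are LLM/VLM agents, whereas here the planner is a simple rule (try the remembered tool first, then the remaining tools in order) and the evaluator is any caller-supplied scoring function. The names `Memory`, `Result`, `run_loop`, `threshold`, and `max_iters` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Image = List[List[float]]  # toy stand-in for a microscopy image
Mask = List[List[int]]     # toy stand-in for a segmentation mask


@dataclass
class Memory:
    """Hypothetical long-term memory: best-scoring tool seen per modality."""
    best_tool: Dict[str, str] = field(default_factory=dict)


@dataclass
class Result:
    tool: str
    score: float
    mask: Mask
    tried: List[str]  # every tool attempted during this call, in order


def run_loop(
    image: Image,
    modality: str,
    tools: Dict[str, Callable[[Image], Mask]],
    evaluate: Callable[[Image, Mask], float],
    memory: Memory,
    threshold: float = 0.8,   # assumed acceptance threshold on the evaluator score
    max_iters: int = 3,       # assumed cap on plan/run/check rounds
) -> Optional[Result]:
    tried: List[str] = []
    best: Optional[Result] = None
    for _ in range(max_iters):
        # Planner (rule-based stand-in): remembered tool first, then untried tools.
        remembered = memory.best_tool.get(modality)
        if remembered in tools and remembered not in tried:
            name = remembered
        else:
            untried = [t for t in tools if t not in tried]
            if not untried:
                break
            name = untried[0]
        tried.append(name)
        mask = tools[name](image)      # Executor: run the chosen tool
        score = evaluate(image, mask)  # Evaluator: quality-check the mask
        if best is None or score > best.score:
            best = Result(name, score, mask, tried)
        if score >= threshold:
            break
    if best is not None:
        memory.best_tool[modality] = best.tool  # commit the outcome to memory
    return best
```

With two toy tools, a first call tries both before settling on the better one; a second call on the same modality goes straight to the remembered tool, which is the behavior the memory component is meant to provide.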
Load-bearing premise
The LLM-based planner can reliably choose the optimal tool and the evaluator can accurately judge segmentation quality without introducing systematic errors or biases across heterogeneous modalities.
What would settle it
A new microscopy modality or organelle type where the agent repeatedly selects a tool that underperforms the best baseline even after supplying reference images and text guidance.
Original abstract
Cellular image segmentation is essential for quantitative biology yet remains difficult due to heterogeneous modalities, morphological variability, and limited annotations. We present GenCellAgent, a training-free multi-agent framework that orchestrates specialist segmenters and generalist vision-language models via a planner-executor-evaluator loop (choose tool $\rightarrow$ run $\rightarrow$ quality-check) with long-term memory. The system (i) automatically routes images to the best tool, (ii) adapts on the fly using a few reference images when imaging conditions differ from what a tool expects, (iii) supports text-guided segmentation of organelles not covered by existing models, and (iv) commits expert edits to memory, enabling self-evolution and personalized workflows. Across seven cell-segmentation benchmarks spanning diverse microscopy modalities (4,718 images), this routing consistently matches or exceeds the best individual tool on every dataset and outperforms all baselines in overall accuracy. On out-of-distribution organelle data, GenCellAgent substantially outperforms specialist models that were not trained on the target domain, recovering structures that dedicated tools fail to detect. It also segments novel objects such as the Golgi apparatus via iterative text-guided refinement, with light human correction further boosting performance. Together, these capabilities provide a practical path to robust, adaptable cellular image segmentation without retraining, while reducing annotation burden and matching user preferences.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces GenCellAgent, a training-free multi-agent system that uses a planner-executor-evaluator loop with long-term memory to route cellular images to specialist segmentation tools, adapt via few-shot references, perform text-guided refinement for novel structures, and commit edits for self-evolution. It claims that this routing matches or exceeds the best single tool on each of seven benchmarks (4718 images across modalities) and substantially outperforms specialist models on out-of-distribution organelle data by recovering missed structures.
Significance. If the central empirical claims hold after validation of the evaluator component, the work would be significant for demonstrating a practical, annotation-light approach to generalizable segmentation that combines existing tools without retraining and handles modality shifts and novel objects via LLM orchestration. The training-free and self-evolving aspects address real pain points in quantitative biology where new imaging conditions frequently arise.
major comments (3)
- [Methods (planner-executor-evaluator loop description)] The headline result that routing matches or exceeds the best individual tool on all seven benchmarks (4718 images) and recovers structures on OOD organelle data is load-bearing for the paper's contribution, yet the manuscript provides no validation that the VLM-based evaluator produces quality scores that correlate with ground-truth metrics such as IoU or Dice on microscopy images. Low-contrast boundaries and staining artifacts common in these data could lead to systematic mis-scoring, undermining both tool selection and refinement decisions.
- [§4] §4 (Results on benchmarks): the reported outperformance lacks accompanying statistical tests, error bars across multiple runs, or failure-mode analysis, making it impossible to determine whether observed gains are robust or driven by particular image subsets.
- [§4.3 (OOD organelle experiments)] The text-guided refinement for organelles such as the Golgi apparatus is presented as a key capability, but the manuscript does not quantify how many iterations are typically required, the success rate of the VLM evaluator in triggering correct edits, or direct comparisons against baselines on the same OOD dataset.
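The first major comment asks whether evaluator scores track ground-truth metrics such as IoU or Dice. A minimal sketch of that check, assuming binary masks flattened to 0/1 sequences and one evaluator score per image, is shown here in plain Python. The helper names (`iou_and_dice`, `pearson`, `ranks`, `spearman`) are illustrative and not taken from the paper.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def iou_and_dice(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """IoU and Dice for two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    p_sum = sum(1 for p in pred if p)
    t_sum = sum(1 for t in truth if t)
    union = p_sum + t_sum - inter
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated here as perfect agreement
    return inter / union, 2 * inter / (p_sum + t_sum)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

In use, one would compute `iou_and_dice` per image against the benchmark annotation, collect the evaluator's score for the same mask, and report `pearson` and `spearman` over the paired lists. A low rank correlation on low-contrast or artifact-heavy subsets would point to the systematic mis-scoring the referee is concerned about.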
minor comments (2)
- [Methods] The notation for the long-term memory update rule and the exact prompt templates used for the evaluator are not fully specified, which would aid reproducibility.
- [Figures] Figure captions for the qualitative examples should include the specific tool selected by the planner and the evaluator score for each panel to allow readers to trace the decision process.
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments, which highlight important aspects for strengthening the empirical validation of GenCellAgent. We appreciate the recognition of the work's potential significance for training-free, generalizable segmentation in quantitative biology. Below we respond point-by-point to the major comments, indicating where revisions will be made to address the concerns.
Point-by-point responses
- Referee: [Methods (planner-executor-evaluator loop description)] The headline result that routing matches or exceeds the best individual tool on all seven benchmarks (4718 images) and recovers structures on OOD organelle data is load-bearing for the paper's contribution, yet the manuscript provides no validation that the VLM-based evaluator produces quality scores that correlate with ground-truth metrics such as IoU or Dice on microscopy images. Low-contrast boundaries and staining artifacts common in these data could lead to systematic mis-scoring, undermining both tool selection and refinement decisions.
  Authors: We agree that explicit validation of the VLM evaluator's scores against ground-truth metrics would strengthen the claims. While the end-to-end benchmark results provide indirect support for the evaluator's utility in tool selection and refinement, we acknowledge the risk of mis-scoring due to low-contrast boundaries or artifacts. In the revised manuscript we will add a dedicated analysis (in Methods or Supplementary Information) that computes correlation (Pearson and Spearman) between the evaluator's quality scores and IoU/Dice on a representative subset of images with available ground truth. This will directly address potential systematic biases.
  Revision: yes
- Referee: [§4] §4 (Results on benchmarks): the reported outperformance lacks accompanying statistical tests, error bars across multiple runs, or failure-mode analysis, making it impossible to determine whether observed gains are robust or driven by particular image subsets.
  Authors: We thank the referee for this observation on statistical rigor. The current results report mean performance across the 4,718 images, but we recognize that statistical tests and variability measures would better demonstrate robustness. In revision we will add paired statistical tests (e.g., Wilcoxon signed-rank) comparing GenCellAgent to the best single tool per benchmark, report standard deviations or error bars where multiple LLM runs are feasible, and include a concise failure-mode analysis highlighting image characteristics associated with underperformance.
  Revision: yes
- Referee: [§4.3 (OOD organelle experiments)] The text-guided refinement for organelles such as the Golgi apparatus is presented as a key capability, but the manuscript does not quantify how many iterations are typically required, the success rate of the VLM evaluator in triggering correct edits, or direct comparisons against baselines on the same OOD dataset.
  Authors: We concur that additional quantitative details on the refinement loop would clarify the practical value of this capability. In the revised §4.3 we will report (i) the average and distribution of iterations needed for convergence on the OOD organelle data, (ii) the success rate of the evaluator in correctly triggering edits that improve segmentation (measured against ground truth where available), and (iii) direct side-by-side comparisons with the specialist baselines on the identical OOD test set. These metrics will be presented in the main text or a supplementary table.
  Revision: yes
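The rebuttal proposes paired tests such as the Wilcoxon signed-rank test on per-image scores. The sketch below shows what that comparison computes, assuming two equal-length lists of per-image scores (agent versus best single tool). It uses the large-sample normal approximation without tie or continuity correction, so the p-value is rough for small samples; for a real analysis an established statistics package (for example `scipy.stats.wilcoxon`) would be the safer choice. The function name and return shape are illustrative.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def wilcoxon_signed_rank(
    a: Sequence[float], b: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (W_plus, W_minus, two-sided p) for paired samples a and b.

    Zero differences are dropped; tied absolute differences share their
    average rank. The p-value uses a normal approximation (assumption:
    enough non-zero pairs for that approximation to be reasonable).
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    rank = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            rank[order[k]] = (i + j) / 2 + 1
        i = j + 1

    w_plus = sum(r for r, d in zip(rank, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(rank, diffs) if d < 0)

    # Under the null hypothesis, W_plus has this mean and standard deviation.
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_two_sided = erfc(abs(z) / sqrt(2))
    return w_plus, w_minus, p_two_sided
```

Running this once per benchmark, with the agent's per-image score in `a` and the best single tool's in `b`, would indicate whether a reported gain is consistent across images or driven by a small subset.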
Circularity Check
No circularity: empirical system evaluation on external benchmarks
Full rationale
The paper describes a multi-agent framework for image segmentation and reports its performance via direct empirical comparisons against independent specialist tools and baselines across seven external datasets totaling 4718 images. No mathematical derivation, parameter fitting presented as prediction, or self-referential equations appear in the provided text. All central claims rest on observable routing accuracy, adaptation results, and out-of-distribution performance measured against ground-truth annotations from the benchmarks themselves, with no load-bearing self-citation chains or ansatz smuggling required to support the reported outcomes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Existing specialist segmentation tools can be effectively selected and adapted by LLM agents for heterogeneous cellular images without domain-specific fine-tuning.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: planner–executor–evaluator loop (choose tool → run → quality-check) with long-term memory... style-aware matching... iterative prompt refinement
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: alpha_pin_under_high_calibration (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: test-time scaling... N trials per iteration... evaluator score threshold
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- CellScientist: Dual-Space Hierarchical Orchestration for Closed-Loop Refinement of Virtual Cell Models
  CellScientist introduces a dual-space hierarchical orchestration system that enables closed-loop refinement of virtual cell models by routing execution discrepancies back to hypothesis or implementation updates, yield...
- AblateCell: A Reproduce-then-Ablate Agent for Virtual Cell Repositories
  AblateCell reproduces baselines in three single-cell perturbation repositories with 88.9% success and recovers ground-truth critical components with 93.3% accuracy via closed-loop ablation.
Appendix A Tools Repository
The repository integrates multiple categories of tools. The primary LLM used is Gemini-2.0-Flash, while the evaluation agent employs Gemini-2.5-Flash-Preview-...
- Analyze the query, previous reasoning steps, and observations.
- Decide on the next action: use a tool or provide a final answer
- Respond in the following JSON format: If you need to use a tool: {{ "thought": "Your detailed reasoning about what to do next", "action": {{ "name": "Tool name (google, imagesegmentation, oneshotsegmentation, segmentationevaluation)", "reason": "Explanation of why you chose this tool", "input": "Specific input for the tool, if different from the original ...
- **Reference Image 1** A poor segmentation mask (score = 0)
- **Reference Image 2** Another poor segmentation mask (score = 0)
- **Evaluation Image** A new segmentation mask that needs to be evaluated.
Use the first two images as context examples to understand what poor segmentation looks like. Then evaluate the third image according to the criteria below.
---
### Evaluation Criteria (0-100 scoring scale, with weights):
- **Stacked Morphology** (Weight: 0.35) Assess how well the membrane layers are organized and stacked in the segmentation
- **Cisternae Definition** (Weight: 0.25) Evaluate the clarity, separation, and recognizable structure of cisternae in the segmentation
- **Overall Cohesion** (Weight: 0.2) Does the segmentation appear connected, logical, and anatomically plausible as a whole?
- **Segmentation Cleanliness** (Weight: 0.2) Check for artifacts, stray regions, or noise that detracts from the clarity of the segmentation.
---
### Reference Image 1 (Score = 0)
This segmentation mask performs poorly across all evaluation criteria, as it incorrectly labels the entire image area as segmented, without distinguishing relevant structures from...
- Describe the segmentation process used in the current run
- List the tools used in order and what each contributed
- Summarize the user's interaction behavior:
  - Use of automatic vs manual tools
  - Use of references (e.g. one-shot segmentation)
  - Feedback frequency
  - Number of iterations
- Based on this run alone, recommend: [CURRENT RUN] Recommended HITL Mode: <Fully Automatic | Reference Guided | Human Interaction> Reason: <why this HITL mode fits this specific run>
---
## PART 2: Long-Term User Profile and Final Recommendation
- Review the historical HITL recommendations and detect **behavioral trends**:
  - Is the user becoming more or less interactive over time?
  - Are they consistently using the same tools or exploring new ones?
  - Are they gradually shifting from automation to correction (or vice versa)?
- Generate a long-term **User Profile** considering both the current and past sessions. Example profiles:
  - "Consistently prefers fully automated workflows with minimal feedback."
  - "Has evolved from reference-based guidance to more manual correction."
  - "Initially used correction tools but now prefers faster automatic approaches."
- Provide the final recommendation: [OVERALL RECOMMENDATION] Recommended HITL Mode: <Fully Automatic | Reference Guided | Human Interaction> User Profile: <summary across runs that includes progression or consistency> Reason: <why this mode is appropriate based on the pattern across sessions>
---
## Guidance:
- If the tool `oneshotsegmentation` was used in ...
The user increasingly engages with manual tools
[64]
Summarize the Visual Characteristics described across the search content
-
[65]
Help me segment the mitochondrion in the provided image. Please use MitoNet
Generate a Segmentation Prompt that could be used to guide a visual segmentation tool based on those characteristics. {search content} Output format:\n" ### Visual Characteristics Summary ### [your summary here] ### Segmentation Prompt ### [your segmentation prompt here] Listing 6: Search Summarize Prompt Appendix D More Details for Human Interactions The...
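The evaluator prompt excerpt above lists four weighted criteria (0.35, 0.25, 0.2, 0.2) on a 0-100 scale. As a minimal sketch, and assuming the sub-scores are combined as a plain weighted sum (the excerpt implies this but is truncated before saying so), the overall score could be computed as follows. The dictionary keys and the function name `weighted_score` are hypothetical; only the weights come from the excerpt.

```python
from typing import Dict, Mapping

# Weights copied from the quoted evaluation criteria; key names are illustrative.
GOLGI_WEIGHTS: Dict[str, float] = {
    "stacked_morphology": 0.35,
    "cisternae_definition": 0.25,
    "overall_cohesion": 0.20,
    "segmentation_cleanliness": 0.20,
}


def weighted_score(
    sub_scores: Mapping[str, float],
    weights: Mapping[str, float] = GOLGI_WEIGHTS,
) -> float:
    """Combine per-criterion scores (each 0-100) into one 0-100 score.

    Assumption: a simple weighted sum; the source does not confirm the
    exact aggregation rule used by the evaluation agent.
    """
    missing = set(weights) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in weights:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be in [0, 100]")
    return sum(weights[name] * sub_scores[name] for name in weights)
```

Because the weights sum to 1, a mask that scores 0 on every criterion (like the two reference masks in the prompt) maps to 0, and a mask that scores 100 on every criterion maps to 100.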