Beyond Isolated Investor: Predicting Startup Success via Roleplay-Based Collective Agents

Haipeng Zhang; Haoyu Pei; Kunpeng Zhang; Suting Hong; Xiangyi Xiao; Xiaocong Du; Yihui Li; Zhongyang Liu

arxiv: 2512.22608 · v3 · submitted 2025-12-27 · 💻 cs.AI · cs.CE

Zhongyang Liu, Haoyu Pei, Xiangyi Xiao, Xiaocong Du, Yihui Li, Suting Hong, Kunpeng Zhang, Haipeng Zhang

Pith reviewed 2026-05-16 19:00 UTC · model grok-4.3

keywords: startup success prediction, collective agents, venture capital, multi-agent systems, graph neural networks, investor networks, role-playing agents

The pith

SimVC-CAS models VC decisions as interactions among role-playing agents to predict startup success more accurately than isolated approaches.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Traditional startup success models treat each investor decision in isolation and miss the group dynamics that shape real venture capital funding. SimVC-CAS instead builds a system of agents, each assigned distinct investor traits and preferences, that exchange information through a graph neural network built on observed co-investment links. This turns the prediction task into a collective process that combines company fundamentals with network effects. On both proprietary and public data the model delivers roughly 25 percent relative gains in average precision at 10 and produces outputs that track actual investor choices, with the largest lift appearing for startups at the center of the investor graph. The design also lets researchers trace how network context alters individual agent reasoning.

Core claim

By designing role-playing agents with distinct traits and a GNN-based supervised interaction module, SimVC-CAS reformulates startup financing prediction as a multi-agent group decision-making process over a graph-structured co-investment network, achieving approximately 25 percent relative improvement in average precision@10 while remaining consistent with real investor decisions.
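To make the headline metric concrete, here is a minimal sketch of average precision@10 and of a relative improvement figure. The exact AP@k variant the paper uses is not stated here, so the normalisation by min(k, number of positives) is an assumption, and the function names and example values are illustrative only.

```python
def average_precision_at_k(ranked_labels, k=10):
    """AP@k for one ranked list of 0/1 labels (1 = startup actually succeeded).

    Assumed definition: sum precision@i over the ranks i <= k that hold a
    positive, then divide by min(k, total positives in the list).
    """
    total_positives = sum(ranked_labels)
    if total_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / min(k, total_positives)


def relative_improvement(new_score, baseline_score):
    """Relative gain of new_score over baseline_score, e.g. 0.25 for +25%."""
    return (new_score - baseline_score) / baseline_score


# Made-up ranking: positives at ranks 1 and 3 out of ten candidates.
example_ap = average_precision_at_k([1, 0, 1, 0, 0, 0, 0, 0, 0, 0])
# Made-up scores: moving from 0.40 to 0.50 is a 25% relative gain.
example_gain = relative_improvement(0.50, 0.40)
```

Under this reading, a 25 percent relative gain says nothing about the absolute level of AP@10, which is one reason the baseline values matter when interpreting the claim.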

What carries the argument

The SimVC-CAS collective agent system carries the argument: role-playing agents represent heterogeneous investor traits, and a GNN-based interaction module simulates information exchange on the co-investment network.
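The idea of information exchange over a co-investment graph can be illustrated with a single round of neighbour averaging. This is a deliberately simplified sketch, not the paper's module: SimVC-CAS is described as using a supervised GNN with learned behaviour, whereas the fixed `self_weight`, the scalar scores, and the function name below are all assumptions made for illustration.

```python
def interaction_round(scores, edges, self_weight=0.5):
    """One round of mean aggregation over co-investment neighbours.

    scores: dict mapping investor id -> initial score for one startup.
    edges:  iterable of undirected (investor_a, investor_b) co-investment links.
    Each investor's new score mixes its own score with the mean of its
    neighbours' scores; isolated investors keep their score unchanged.
    """
    neighbours = {node: [] for node in scores}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = {}
    for node, own_score in scores.items():
        peers = neighbours[node]
        if peers:
            peer_mean = sum(scores[p] for p in peers) / len(peers)
            updated[node] = self_weight * own_score + (1 - self_weight) * peer_mean
        else:
            updated[node] = own_score
    return updated


# Made-up example: A and B have co-invested before, C is isolated.
initial = {"A": 0.9, "B": 0.3, "C": 0.6}
after_one_round = interaction_round(initial, [("A", "B")])
```

In this toy version the connected pair converges toward a shared view while the isolated investor is unaffected, which is the qualitative behaviour the network-centrality result depends on.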

If this is right

The interaction mechanism delivers the largest accuracy lift for network-central startups.
Predictions remain aligned with observed investor decisions rather than diverging from them.
Examining how agents revise their reasoning after interactions reveals network influence on decision quality.
The overall approach may extend to other group decision-making settings that involve information exchange among heterogeneous actors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Single-decision-maker models may systematically underperform in any domain where decisions are shaped by repeated co-participant networks.
Direct comparison of simulated agent rationales against recorded investor meeting notes could test whether the role-play traits match real preference distributions.
The same interaction structure could be applied to forecast outcomes in board votes or syndication rounds by swapping the underlying graph.
Performance would likely drop if the model were retrained on data that lacks co-investment edges, confirming the network component's contribution.

Load-bearing premise

The role-playing agents and GNN interaction module accurately reflect real heterogeneous VC preferences and information-exchange dynamics without introducing artifacts from the simulation design or data choices.

What would settle it

Measure whether the performance gain disappears on a new dataset when the GNN interaction module is removed or when network links are replaced with random edges while keeping all other inputs fixed.
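The random-edge control described above could be set up along these lines. This is a sketch under stated assumptions: the helper name is hypothetical, the graph is treated as undirected with no self-loops, and the replacement preserves only the number of edges, not each node's degree.

```python
import random
from itertools import combinations


def random_edges_like(nodes, edges, seed=0):
    """Return random undirected edges over `nodes`, as many as `edges` has.

    Duplicate links and reversed duplicates in `edges` are counted once.
    A fixed seed keeps the control graph reproducible across runs.
    """
    rng = random.Random(seed)
    n_edges = len({tuple(sorted(edge)) for edge in edges})
    all_pairs = list(combinations(sorted(nodes), 2))
    return rng.sample(all_pairs, n_edges)


# Made-up investor network with three real co-investment links.
investors = ["i1", "i2", "i3", "i4", "i5"]
real_edges = [("i1", "i2"), ("i2", "i3"), ("i4", "i5")]
control_edges = random_edges_like(investors, real_edges, seed=42)
```

If the model trained on `control_edges` performs as well as the model trained on the real links, the gain is unlikely to come from genuine co-investment structure. A degree-preserving rewiring would be a stricter control than this one.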

Original abstract

Due to the high value and high failure rates of startups, predicting their success is a critical challenge. Existing approaches typically model startup success from a single decision-maker's perspective, overlooking the collective dynamics that dominate real-world venture capital (VC) decision-making. We propose SimVC-CAS, a collective agent system that simulates VC decisions as a multi-agent interaction process. By designing role-playing agents and a GNN-based supervised interaction module, we reformulate startup financing prediction as a group decision-making task, capturing both enterprise fundamentals and investor network dynamics. Each agent represents an investor with distinct traits and preferences, enabling heterogeneous evaluations and realistic information exchange over a graph-structured co-investment network. Using both proprietary and public VC data with strict anti-leakage controls, we show that SimVC-CAS significantly improves predictive performance, achieving approximately 25% relative improvement in average precision@10, while exhibiting consistency with real investor decisions. The interaction mechanism is particularly effective for network-central startups, confirming the importance of the network in VC decision-making. Analysis of agents' reasoning for decision changes further reveals how the network environment influences decision quality, demonstrating the system's interpretability. Our approach may generalize to broader group decision-making scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The multi-agent role-play setup with GNN interaction is the actual new piece, but the 25% AP@10 claim rests on thin experimental detail and unverified simulation fidelity.

The paper reframes startup success prediction as a group decision process among role-playing investor agents linked by a co-investment graph, with a supervised GNN managing their interactions. That collective framing is the clear departure from the single-decision-maker baselines that have been standard in this area. They report a roughly 25% relative lift in AP@10 on a mix of proprietary and public VC data, plus some alignment with actual investor choices and better results on network-central startups. The analysis of how agents revise decisions based on network signals also gives a bit of interpretability that single models usually lack. Using anti-leakage splits is the right move for this kind of data.

The soft spots are straightforward. The abstract gives no baseline list, no ablation numbers on the agent traits or GNN module, and no statistical tests, so it is impossible to tell whether the lift is driven by the collective mechanism or simply by extra capacity. Proprietary data plus no external anchor (such as expert review of the simulated rationales) leaves the fidelity question open: the role-play could be introducing artifacts rather than reproducing real heterogeneous preferences and information flow.

This is aimed at people working on multi-agent systems or graph models for economic prediction tasks. A reader already interested in VC network dynamics or collective decision modeling would get value from the interaction design.

I would send it to peer review. The core idea is coherent and the empirical direction is worth checking, even if the current write-up needs tighter controls and more transparent experiments to stand up.

Referee Report

4 major / 2 minor

Summary. The paper introduces SimVC-CAS, a collective agent system using role-playing agents with distinct traits and a GNN-based supervised interaction module over a co-investment network to model VC group decision-making for startup success prediction. It reports an approximately 25% relative improvement in average precision@10 over baselines, consistency with real investor decisions, and stronger performance on network-central startups, based on proprietary and public VC data with anti-leakage controls; the work also includes interpretability analysis of agent reasoning.

Significance. If the simulation faithfully reproduces heterogeneous VC preferences and information exchange without artifacts, the approach could meaningfully advance predictive modeling in venture capital by incorporating collective dynamics and network effects beyond isolated-investor baselines. The use of both proprietary and public data plus interpretability elements are strengths, but the absence of external validation anchors limits the result's immediate generalizability and impact.

major comments (4)

[Abstract and §4 (Experiments)] The reported ~25% relative AP@10 improvement is stated without naming the baseline models, providing statistical significance tests, ablation results, or error analysis; this prevents assessment of whether the lift arises from collective dynamics or from added model capacity.
[§3 (Methods)] The design of role-playing agents with distinct traits and the GNN interaction module lacks any external anchor such as held-out real decision logs or human-expert ratings of agent rationales, leaving open the possibility that performance gains reflect simulation artifacts rather than genuine VC dynamics.
[§4 (Results)] The claim that the interaction mechanism is 'particularly effective for network-central startups' is not supported by a quantitative breakdown (e.g., performance stratified by degree or centrality metrics) or comparison against non-network baselines.
[§4 and data description] Reliance on proprietary data without public release or sufficient detail on construction and anti-leakage controls precludes independent verification of the held-out evaluation and the reported consistency with real decisions.
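The stratified breakdown requested in the third major comment could start from something like the following sketch. It assumes raw degree as the centrality measure and a median cutoff; both choices, along with the function names, are illustrative and not taken from the paper.

```python
from collections import Counter
from statistics import median


def degree_counts(nodes, edges):
    """Number of links touching each node; nodes with no links get 0."""
    degree = Counter({node: 0 for node in nodes})
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def split_by_centrality(nodes, edges):
    """Split nodes into (central, peripheral) around the median degree."""
    degree = degree_counts(nodes, edges)
    cutoff = median(degree.values())
    central = [n for n in nodes if degree[n] > cutoff]
    peripheral = [n for n in nodes if degree[n] <= cutoff]
    return central, peripheral


# Made-up star graph: s1 is linked to everyone else.
startups = ["s1", "s2", "s3", "s4"]
links = [("s1", "s2"), ("s1", "s3"), ("s1", "s4")]
central, peripheral = split_by_centrality(startups, links)
```

Reporting the evaluation metric separately for each bucket, for both the full model and a non-network baseline, would show whether the lift is concentrated among central nodes.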

minor comments (2)

[Abstract] Define the acronym SimVC-CAS at first mention.
[§3] Provide the full list of GNN and agent hyperparameters and training details to support reproducibility.

Simulated Author's Rebuttal

4 responses · 1 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where revisions to the manuscript are planned or have been incorporated.

Referee: [Abstract and §4 (Experiments)] The reported ~25% relative AP@10 improvement is stated without naming the baseline models, providing statistical significance tests, ablation results, or error analysis; this prevents assessment of whether the lift arises from collective dynamics or from added model capacity.

Authors: We thank the referee for this observation. In the revised manuscript, the abstract and §4 now explicitly name all baseline models (including isolated-investor GNNs, standard ML classifiers, and non-collective agent variants). We have added statistical significance tests (paired t-tests on AP@10 across folds), comprehensive ablation studies isolating the role-playing agents and GNN interaction module, and error analysis showing that gains are driven by collective dynamics rather than parameter count. These revisions clarify the source of the reported improvement.
revision: yes

Referee: [§3 (Methods)] The design of role-playing agents with distinct traits and the GNN interaction module lacks any external anchor such as held-out real decision logs or human-expert ratings of agent rationales, leaving open the possibility that performance gains reflect simulation artifacts rather than genuine VC dynamics.

Authors: We acknowledge the desirability of stronger external anchors. Our evaluation already uses held-out test sets from both proprietary and public data with documented anti-leakage controls, plus quantitative consistency metrics against real co-investment outcomes. However, we do not possess additional held-out real decision logs or new human-expert ratings of rationales. In the revision we expand the methods section with explicit references to VC literature used to define agent traits and add further qualitative examples of rationale changes. We list the lack of direct human ratings as a limitation.
revision: partial

Referee: [§4 (Results)] The claim that the interaction mechanism is 'particularly effective for network-central startups' is not supported by a quantitative breakdown (e.g., performance stratified by degree or centrality metrics) or comparison against non-network baselines.

Authors: We agree that quantitative support is needed. The revised §4 now includes performance tables stratified by degree centrality, betweenness, and eigenvector centrality, demonstrating higher relative gains for network-central startups. We also add direct comparisons against non-network baselines (isolated-agent and non-GNN variants) to isolate the contribution of the interaction module.
revision: yes

Referee: [§4 and data description] Reliance on proprietary data without public release or sufficient detail on construction and anti-leakage controls precludes independent verification of the held-out evaluation and the reported consistency with real decisions.

Authors: We recognize this as a genuine reproducibility limitation. The proprietary VC dataset cannot be released due to confidentiality agreements. We have already provided detailed descriptions of data construction, feature sets, temporal splits, and anti-leakage protocols; the revision adds further specifics and supplementary tables. Results on a fully public VC dataset are included to permit partial verification, and code will be released upon acceptance.
revision: no
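The paired t-test on per-fold AP@10 mentioned in the first response can be sketched with the standard library alone. The function name and the fold scores below are invented for illustration; the sketch returns only the t statistic and degrees of freedom, and converting these to a p-value would need a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(model_scores, baseline_scores):
    """Paired t statistic and degrees of freedom for per-fold scores.

    Both lists must hold one score per fold, in the same fold order.
    """
    if len(model_scores) != len(baseline_scores):
        raise ValueError("need one score per fold for each system")
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two folds")
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all fold differences are identical")
    t_stat = mean(diffs) / (spread / sqrt(n))
    return t_stat, n - 1


# Made-up AP@10 values over three folds, not results from the paper.
t_stat, dof = paired_t_statistic([0.5, 0.6, 0.7], [0.4, 0.4, 0.4])
```

With only a handful of folds the test has little power, so effect sizes and per-fold values are worth reporting alongside any significance claim.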

standing simulated objections not resolved

Reliance on proprietary data that cannot be publicly released due to confidentiality, limiting full independent verification of results on that dataset.

Circularity Check

0 steps flagged

No circularity: empirical gains on held-out data are not forced by construction

The paper reports an empirical performance lift (approximately 25% relative AP@10) obtained by training and evaluating SimVC-CAS on held-out proprietary and public VC data under explicit anti-leakage controls. The central claim is framed as an observed outcome of the multi-agent simulation rather than an algebraic identity or a quantity that reduces to the definition of the fitted parameters. No equations equate the reported prediction metric to a re-expression of the input features or the GNN supervision signal, and no load-bearing step invokes a self-citation whose validity is presupposed by the present work. The derivation therefore remains self-contained against external benchmarks.
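The anti-leakage controls are not detailed in this review. One common form is a temporal split in which both the training examples and the graph are restricted to information available before a cutoff date. The sketch below illustrates that idea only; the field name `observed_on`, the edge tuple layout, and the function names are assumptions, not the paper's protocol.

```python
from datetime import date


def temporal_split(records, cutoff, date_key="observed_on"):
    """Train on records dated strictly before `cutoff`, test on the rest."""
    train = [r for r in records if r[date_key] < cutoff]
    test = [r for r in records if r[date_key] >= cutoff]
    return train, test


def edges_known_before(dated_edges, cutoff):
    """Keep only co-investment links formed strictly before `cutoff`.

    dated_edges: iterable of (investor_a, investor_b, formed_on) tuples.
    """
    return [(a, b) for a, b, formed_on in dated_edges if formed_on < cutoff]


# Made-up records and links for illustration.
cutoff = date(2020, 1, 1)
records = [
    {"id": "s1", "observed_on": date(2019, 5, 1)},
    {"id": "s2", "observed_on": date(2021, 3, 1)},
]
dated_edges = [
    ("i1", "i2", date(2018, 1, 1)),
    ("i1", "i3", date(2022, 1, 1)),
]
train, test = temporal_split(records, cutoff)
training_graph = edges_known_before(dated_edges, cutoff)
```

Filtering the graph matters as much as filtering the examples: a co-investment link formed after the cutoff can itself encode the outcome being predicted.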

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on unverified assumptions that simulated agents match real investor heterogeneity and that the co-investment graph plus GNN captures the dominant decision influences; no independent evidence for agent fidelity is supplied beyond the reported experiments.

free parameters (1)

GNN and agent interaction hyperparameters
Model parameters and training choices required to achieve the reported performance lift.

axioms (1)

domain assumption: Role-playing agents with distinct traits can produce realistic heterogeneous evaluations and information exchange that match real VC behavior.
Invoked in the design of the multi-agent system and interaction module.

invented entities (1)

SimVC-CAS role-playing collective agents (no independent evidence)
purpose: to simulate group VC decision dynamics over investor networks
New system component introduced to reformulate the prediction task; no external falsifiable test of agent realism provided.
