VCBench: Benchmarking LLMs in Venture Capital

2025 · cs.AI · arXiv 2509.14448

2 Pith papers cite this work. Polarity classification is still indexing.



abstract

Benchmarks such as SWE-bench and ARC-AGI demonstrate how shared datasets accelerate progress toward artificial general intelligence (AGI). We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC), a domain where signals are sparse, outcomes are uncertain, and even top investors perform modestly. At inception, the market index achieves a precision of 1.9%. Y Combinator outperforms the index by a factor of 1.7x, while tier-1 firms are 2.9x better. VCBench provides 9,000 anonymized founder profiles, standardized to preserve predictive features while resisting identity leakage, with adversarial tests showing more than 90% reduction in re-identification risk. We evaluate nine state-of-the-art large language models (LLMs). DeepSeek-V3 delivers over six times the baseline precision, GPT-4o achieves the highest F0.5, and most models surpass human benchmarks. Designed as a public and evolving resource available at vcbench.com, VCBench establishes a community-driven standard for reproducible and privacy-preserving evaluation of AGI in early-stage venture forecasting.
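The abstract reports human baselines as multiples of the market index precision and ranks models by F0.5. The sketch below works through that arithmetic under stated assumptions: the 1.9% baseline and the 1.7x, 2.9x, and 6x multipliers come from the abstract, while the function names are hypothetical and `f_beta` is the standard textbook F-beta definition, not code from the VCBench release. The derived precisions are implied values, not figures quoted from the paper.

```python
# Illustrative arithmetic only; names are hypothetical, not from VCBench.

BASELINE_PRECISION = 0.019  # market index precision at inception (abstract)


def implied_precision(multiplier: float,
                      baseline: float = BASELINE_PRECISION) -> float:
    """Precision implied by 'N times the baseline' as stated in the abstract."""
    return baseline * multiplier


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score. beta < 1 weights precision above recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    rows = [
        ("Y Combinator", 1.7),
        ("tier-1 firms", 2.9),
        ("DeepSeek-V3 (lower bound, 'over six times')", 6.0),
    ]
    for name, multiplier in rows:
        print(f"{name}: {implied_precision(multiplier):.2%}")
```

Under these assumptions the implied precisions are roughly 3.2% for Y Combinator, 5.5% for tier-1 firms, and above 11.4% for DeepSeek-V3. With beta set to 0.5, precision counts more heavily than recall, which fits a setting where false positives (backing founders who do not succeed) are the costlier error.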

representative citing papers

IPO Finance Agent: Benchmark of LLM Financial Analysts Beyond Finance Agent v2, with Automated Rubric Generation, on the SpaceX (SPCX) IPO

cs.AI · 2026-06-22 · unverdicted · novelty 6.0

IPO Finance Agent benchmarks LLMs on SpaceX S-1 questions with contextual retrieval and auto-generated rubrics, reporting up to 79.8% accuracy and better cost-efficiency than prior Finance Agent v2 entries.

YC Bench: a Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches

cs.LG · 2026-04-01 · accept · novelty 6.0

YC Bench is a new live benchmark that evaluates forecasting models for startup outperformance within YC batches using a short-term Pre-Demo Day Score derived from public traction signals.

