AGI Will Replace Average VCs. The Best Ones? Different Game.

The performance gap between tier-1 human VCs and current AI on startup selection isn't what you think. VCBench, a new standardized benchmark on which both humans and LLMs evaluate 9,000 anonymized founder profiles, shows top VCs achieving 5.6% precision. GPT-4o hit 29.1%. DeepSeek-V3 reached 59.1% (though with a brutal 3% recall, meaning it almost never said "yes").[1]

That's not a rounding error. It's a 5-10x gap in precision, the metric that matters most in VC, where false positives (bad investments) are far costlier than false negatives (missed deals).[1]
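To make those headline numbers concrete, here is the arithmetic they imply, as a minimal sketch; the counts are hypothetical, back-solved from the reported 9% base rate, 59.1% precision, and 3% recall:

```python
# Precision/recall arithmetic behind the VCBench headline numbers.
# Illustrative only: the counts below are hypothetical, chosen to
# roughly match the reported rates.

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

total = 9_000
successes = int(total * 0.09)        # ~810 "winners" at the inflated 9% base rate

# A model with 3% recall flags only ~24 of those 810 winners...
tp = round(successes * 0.03)         # ~24 true positives
# ...and at ~59% precision it makes ~17 wrong "yes" calls alongside them.
fp = round(tp / 0.591) - tp          # ~17 false positives

print(f"model says yes to {tp + fp} of {total} companies")
print(f"precision = {precision(tp, fp):.1%}, recall = {recall(tp, successes - tp):.1%}")
```

Run it and the model says "yes" to roughly 41 companies out of 9,000, which is what 3% recall means in practice: it almost never commits, but when it does, it is right far more often than the human benchmark.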

But here's what the paper doesn't solve: VCBench inflated the success rate from the real-world 1.9% to 9% for statistical stability, and precision doesn't scale linearly when you drop the base rate back down. The benchmark also can't test sourcing, founder relationships, or board-level value-add, all critical to real fund performance. And there's a subtle time-travel problem: models might be exploiting macro-trend knowledge (e.g., "crypto founder 2020-2022 = likely exit") rather than true founder-quality signals.[2]
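How non-linearly? A quick Bayes'-rule sketch, under the simplifying assumption that the model's true- and false-positive rates stay fixed as the base rate shifts (real behavior may differ):

```python
# How precision degrades when the base rate drops from VCBench's 9%
# back to the real-world 1.9%, holding the model's true-positive and
# false-positive rates fixed (a simplifying assumption).

def precision_at_base_rate(base, tpr, fpr):
    """Bayes' rule: P(success | model says yes)."""
    yes_and_success = base * tpr
    yes_and_failure = (1 - base) * fpr
    return yes_and_success / (yes_and_success + yes_and_failure)

tpr = 0.03   # recall reported for DeepSeek-V3

# Back out the implied false-positive rate from 59.1% precision at a 9% base rate.
fpr = 0.09 * tpr * (1 - 0.591) / (0.591 * (1 - 0.09))

print(f"precision at 9.0% base rate: {precision_at_base_rate(0.09, tpr, fpr):.1%}")
print(f"precision at 1.9% base rate: {precision_at_base_rate(0.019, tpr, fpr):.1%}")
# ~59% collapses to roughly 22%: still above the human benchmark, but far from 59%.
```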

Still, the directional message is clear: there is measurable, extractable signal in structured founder data that LLMs capture better than human intuition. The narrative that "AI will augment but never replace VCs" is comforting and wrong. The question isn't if AGI venture capitalists will exist; it's when they cross 15-20% unicorn hit rates in live portfolios (double the best human benchmark) and what that phase transition does to the rest of us.

The math is brutal for average funds

Firebolt Ventures has been cited as leading the pack with a 10.1% unicorn hit rate: 13 unicorns from 129 investments since 2020 (per a Stanford GSB VCI-backed analysis, as shared publicly). Andreessen Horowitz sits at 5.5% on the same since-2020 hit-rate framing, albeit at far larger volume. And importantly: Sequoia fell just below the 5% cutoff on that ranking, less because of a lack of wins than because high volume dilutes hit rate.[3]

The 2017 vintage, now mature enough to score, shows top-decile funds hitting 4.22x TVPI (total value to paid-in capital). The median? 1.72x. Most venture outcomes are random noise dressed up as strategy.

Here's the punchline: PitchBook's 20-year LP study has been summarized as finding that even highly skilled manager selectors (those with 40%+ hit rates at picking top-quartile funds) generate only ~0.61% in additional annual returns, and that skilled selection beats random portfolios ~98.1% of the time in VC (vs. ~99.9% in buyouts).

If the best fund pickers in the world can barely separate signal from noise, what does that say about VC selection itself?

AGI VCs won't need warm intros

Current ML research suggests models can identify systematic misallocation even within the set of companies VCs already fund. In "Venture Capital (Mis)Allocation in the Age of AI" (Lyonnet & Stern, 2022), the median VC-backed company ranks at the 83rd percentile of model-predicted exit probability, meaning VCs are directionally good but still leave money on the table. Within the same industries and locations, the authors estimate that reallocating toward the model's top picks would increase VCs' imputed MOIC by ~50%.
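The reallocation logic is easy to demonstrate on synthetic data. This toy sketch assumes a latent quality signal that a model reads with less noise than a human ranker; the numbers are simulated and do not reproduce the paper's estimate:

```python
# Toy illustration of the reallocation logic: rank companies by predicted
# exit probability, then compare the return multiple of a noisier "human"
# ranking against the model's top picks. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
quality = rng.normal(size=n)                            # latent founder/company quality
p_exit = 1 / (1 + np.exp(-(quality - 2.5)))             # true exit probability, rare events
model_score = quality + rng.normal(scale=0.5, size=n)   # model sees quality, noisily
vc_score = quality + rng.normal(scale=1.0, size=n)      # VCs see it with more noise

exit_multiple = np.where(rng.random(n) < p_exit, 20.0, 0.2)  # crude power-law payoff

def portfolio_moic(scores, k=500):
    picks = np.argsort(scores)[-k:]                     # invest in the top-k ranked companies
    return exit_multiple[picks].mean()

print(f"VC-ranked portfolio MOIC:    {portfolio_moic(vc_score):.2f}x")
print(f"model-ranked portfolio MOIC: {portfolio_moic(model_score):.2f}x")
```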

That alpha exists because human VCs are bottlenecked by:

Information processing limits. Partners evaluate ~200-500 companies/year. An AGI system can scan orders of magnitude more, continuously.

Network constraints. You can't invest in founders you never meet. AGI doesn't need warm intros; it can surface weak signals from GitHub velocity, hiring patterns, or web/social-traffic deltas before the traditional network even sees the deck (see the sketch after this list).

Cognitive biases. We over-index on storytelling, pedigree, and pattern-matching to our last winner. Algorithms don't care if the founder went to Stanford or speaks confidently. They care about predictors of tail outcomes.
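A minimal sketch of what that weak-signal sourcing could look like; the feature names and weights below are hypothetical stand-ins for what a trained model would learn from labeled outcomes:

```python
# Minimal algorithmic-sourcing sketch. Assumptions: the signal features and
# the hand-set weights are hypothetical; in practice the weights would be
# learned from historical outcomes, not written down by a human.
from dataclasses import dataclass

@dataclass
class CompanySignals:
    github_commit_velocity: float   # weekly commits, normalized vs. peer repos
    hiring_delta_90d: float         # net new engineering hires in 90 days
    web_traffic_delta_90d: float    # relative change in site/social traffic

WEIGHTS = {
    "github_commit_velocity": 0.5,
    "hiring_delta_90d": 0.3,
    "web_traffic_delta_90d": 0.2,
}

def sourcing_score(s: CompanySignals) -> float:
    return (WEIGHTS["github_commit_velocity"] * s.github_commit_velocity
            + WEIGHTS["hiring_delta_90d"] * s.hiring_delta_90d
            + WEIGHTS["web_traffic_delta_90d"] * s.web_traffic_delta_90d)

# Surface companies for outreach before they start a formal raise.
candidates = {
    "acme-dev-tools": CompanySignals(2.1, 1.4, 0.8),
    "quiet-b2b-saas": CompanySignals(0.3, 0.1, -0.2),
}
shortlist = sorted(candidates, key=lambda k: sourcing_score(candidates[k]), reverse=True)
print(shortlist)   # ['acme-dev-tools', 'quiet-b2b-saas']
```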

Bessemer's famous Anti-Portfolio (the deals they passed on: Google, PayPal, eBay, Coinbase) is proof that even elite judgment systematically misfires. If the misses are predictable in hindsight, they're predictable in foresight given the right model.

The five gaps closing faster than expected

AGI isn't here yet because five bottlenecks remain:

Continual learning. Current models largely freeze after training. A real VC learns from every pitch, every exit, every pivot. Research directions like "Nested Learning" have been proposed as pathways toward continual learning, but it's still not a solved, production-default capability.

Visual perception. Evaluating pitch decks, product demos, and team dynamics from video requires true multimodal understanding. Progress is real, but "human-level" is not the default baseline yet.

Hallucination reduction. For VC diligence, where one wrong fact about IP or founder background kills the deal, today's hallucination profile is still too risky. Instead of claiming a universal "96% reduction," the defensible claim is that retrieval-augmented generation plus verification/guardrails can sharply reduce hallucinations in practice, with the magnitude depending on corpus quality and evaluation method.
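What that verification layer might look like, as a hedged sketch: llm() and retrieve() below are hypothetical stand-ins for whatever model and retrieval stack a fund actually runs, not a real library API:

```python
# Sketch of a verify-before-trust loop for diligence claims. Assumptions:
# llm() is any text-completion callable and retrieve() is any document
# search callable; neither corresponds to a specific vendor API.
from typing import Callable

def verified_answer(question: str,
                    retrieve: Callable[[str], list[str]],
                    llm: Callable[[str], str]) -> str:
    sources = retrieve(question)                     # ground the model in documents
    context = "\n\n".join(sources)
    draft = llm(f"Answer ONLY from these sources:\n{context}\n\nQ: {question}")

    # Second pass: ask the model to check its own draft against the sources.
    verdict = llm(f"Sources:\n{context}\n\nClaim: {draft}\n"
                  "Is every statement in the claim directly supported? YES or NO.")
    if verdict.strip().upper().startswith("YES"):
        return draft
    return "INSUFFICIENT EVIDENCE: escalate to a human analyst."
```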

Complex planning. Apple's research suggests reasoning models can collapse beyond certain complexity thresholds; venture investing is a 7-10 year planning problem through pivots, rounds, and market shifts.

Causal reasoning. Correlation doesn't answer "If we invest $2M vs. $1M, what happens?" Causal forests and double machine learning (double ML) estimate treatment effects while controlling for confounders. The infrastructure exists; it's not yet integrated into frontier LLMs. Give it 18 months.
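Here is a minimal double-ML sketch on synthetic data. The cross-fitting and residual-on-residual regression follow the standard recipe (Chernozhukov et al.); the data-generating process is made up for illustration:

```python
# Minimal double-ML sketch for "invest $2M vs. $1M" as a treatment effect.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n = 5_000
X = rng.normal(size=(n, 5))                 # confounders: sector, stage, traction...
t = (X[:, 0] > 0).astype(float) + rng.normal(scale=0.3, size=n)  # $M invested, confounded
y = 0.8 * t + X[:, 0] + rng.normal(size=n)  # outcome; true effect of +$1M is 0.8

# Stage 1: predict treatment and outcome from confounders, with cross-fitting.
t_hat = cross_val_predict(RandomForestRegressor(n_estimators=100), X, t, cv=5)
y_hat = cross_val_predict(RandomForestRegressor(n_estimators=100), X, y, cv=5)

# Stage 2: regress outcome residuals on treatment residuals.
t_res, y_res = t - t_hat, y - y_hat
effect = (t_res @ y_res) / (t_res @ t_res)
print(f"estimated effect of +$1M: {effect:.2f} (truth: 0.80)")
```

A naive regression of y on t would absorb the confounding from X and overstate the effect; residualizing both sides first is what isolates the causal component.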

Unlike the theoretical barriers to general AGI (which may require paradigm shifts), the barriers to an AGI VC are engineering problems with known solutions.

The phase transition nobody's pricing in

Hugo Duminil-Copin won the Fields Medal for proving how percolation works: below a critical threshold, clusters stay small. Above it, a giant component suddenly dominates. That's not a metaphor; it's a rigorous model of network effects.
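You can watch the threshold appear in a few lines of simulation. A minimal sketch on a random (Erdos-Renyi) graph, purely to illustrate the sharp transition, not to model VC networks:

```python
# Erdos-Renyi percolation demo: below the critical edge probability (1/n)
# the largest cluster stays tiny; just above it, a giant component appears.
import random

def largest_component_fraction(n: int, p: float, seed: int = 0) -> float:
    random.seed(seed)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                parent[find(i)] = find(j)   # merge the two clusters
    sizes = {}
    for i in range(n):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n

n = 2_000
for mult in (0.5, 1.0, 1.5, 2.0):           # edge probability as a multiple of 1/n
    frac = largest_component_fraction(n, mult / n)
    print(f"p = {mult:.1f}/n -> largest cluster spans {frac:.1%} of nodes")
```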

Hypothesis (not settled fact): once AGI-allocated capital crosses something like 15-25% of total VC AUM, network effects could create a nonlinear disadvantage for human-only VCs in deal-flow access and selection quality. Why? Because:

Algorithmic funds identify high-signal companies before they hit the traditional fundraising circuit. If you're a founder and a fund can produce a high-conviction term sheet on a dramatically shorter clock, with clear, inspectable reasoning, you take the meeting.

Network effects compound. The AGI with the best proprietary outcome data (rejected deals, partner notes, failed pivots) trains better models. That attracts better founders. Which generates better data. Repeat.

LPs will demand quantitative benchmarks. "Show me your out-of-sample precision vs. the AGI baseline" becomes table stakes. Funds that can't answer get cut.

The first AGI VC to hit 15% unicorn rates and 6-8x TVPI will trigger the cascade. My estimate: 2028-2029 for narrow domains (B2B SaaS seed deals), 2030-2032 for generalist funds. That's not decades; it's one fund cycle.

What survives: relationship alpha and judgment at the edge

The AGI VC will systematically crush humans on sourcing, diligence, and statistical selection. What it won't replace—at least initially:

Founder trust and warm intros. Reputation still opens doors. An algorithm can't build years of relationship capital overnight.

Strategic support and crisis management. Board-level judgment calls, operational firefighting, ego management in founder conflicts—those require human nuance.

Novel situations outside the training distribution. Unprecedented technologies, regulatory black swans, geopolitical shocks. When there's no historical pattern to learn from, you need human synthesis.

VCs will bifurcate: algorithmic funds competing on data/modeling edge and speed, versus relationship boutiques offering founder services and accepting lower returns. The middle (firms that do neither exceptionally) will get squeezed out.

Operating system for the transition

If you're building or managing a fund today, three moves matter:

1. Build proprietary outcome data now. The best training set isn't Crunchbase; it's your rejected deal flow with notes, your portfolio pivots, your failed companies' post-mortems. That's the moat external models can't replicate. Track every pitch, every IC decision, every update. Structure it for ML ingestion (a minimal record schema is sketched after this list).

2. Instrument your decision process. Precommit to hypotheses ("We think founder X will succeed because Y"). Log the reasoning. Compare predicted vs. actual outcomes quarterly. This builds the feedback loop that lets you detect when your mental model is miscalibrated, and when an algorithm beats you (see the calibration sketch after this list).

3. Segment where you add unique value vs. where you're replaceable. If your edge is "I know this space and can move fast," you're exposed. If it's "founders trust me in a crisis and I've navigated three pivots with them," you're defensible. Be honest about which deals came from relationship alpha versus statistical pattern-matching. Double down on the former; automate the latter.
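On point 1, a minimal sketch of what an ML-ready deal record could look like; the field names are illustrative, not a standard:

```python
# Hypothetical record schema for structuring deal-flow outcomes for later
# ML ingestion. Field names are made up for illustration.
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class DealRecord:
    company: str
    pitch_date: date
    decision: str               # "invested" | "passed" | "lost"
    decision_memo: str          # the reasoning at the time, verbatim
    predicted_outcome: str      # precommitted hypothesis ("exit > $1B because ...")
    outcome_moic: float | None  # filled in years later; None until resolved

rec = DealRecord("acme-dev-tools", date(2025, 3, 1), "passed",
                 "Strong team, market too small.", "no exit > $100M", None)
print(json.dumps(asdict(rec), default=str, indent=2))   # ready for a feature store
```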
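On point 2, one lightweight way to score precommitted predictions each quarter is a Brier score (the mean squared error of your stated probabilities); lower is better, and a drifting score is the miscalibration signal. The data here is hypothetical:

```python
# Quarterly calibration check on precommitted IC predictions.
def brier(predictions: list[tuple[float, bool]]) -> float:
    return sum((p - float(happened)) ** 2 for p, happened in predictions) / len(predictions)

# (predicted probability of success at IC, what actually happened)
q1 = [(0.7, True), (0.6, False), (0.2, False), (0.8, True)]
q2 = [(0.9, False), (0.7, False), (0.6, False), (0.3, True)]
print(f"Q1 Brier: {brier(q1):.2f}   Q2 Brier: {brier(q2):.2f}  (drift -> recalibrate)")
```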

The real test

In three years, when an AGI fund publishes live performance data showing 12-15% unicorn rates and 5-6x TVPI, the LP conversation changes overnight. Not because the technology is elegant, but because the returns are real and the process is transparent.

That's the moment VCs have to answer: what alpha do we generate that a model can't? For many funds, the answer will be uncomfortable. For the best ones, the ones who've always known that determination, speed, and earned insight compound faster than credentials, it'll be clarifying.

The AGI VC era doesn't kill venture capital. It kills the pretense that average judgment plus a warm network equals outperformance. What's left is a smaller, sharper game where human edge has to be provable, not performative.

And if you can't articulate your edge in a sentence, quantifiably and with evidence, you're not competing with other humans anymore. You're competing with an algorithm that already sees your blind spots better than you do.

  1. https://arxiv.org/pdf/2509.14448.pdf
  2. https://www.reddit.com/r/learnmachinelearning/comments/1no8xji/vcbench_new_benchmark_shows_llms_can_predict/
  3. https://www.linkedin.com/posts/ilyavcandpe_top-unicorn-investors-by-hit-rate-since-2020-activity-7362200145880367104-7zTv