The AI Margin Tax: Why SaaS Math Breaks for Venture

Venture capital runs on a simple lie we all tell ourselves: if revenue is going up and the product feels inevitable, the unit economics will sort themselves out later.

That lie worked in SaaS because “later” mostly meant: keep shipping, keep selling, and your marginal cost asymptotically goes to zero. Serving the 10,000th user was basically free. So you could fund growth first, then let operating leverage do its thing.

AI breaks that bargain.

Not because the tech isn’t real. Because the cost structure is. Every useful action can carry a variable compute bill. If you don’t model that bill at the level where value is actually delivered, you can build something that looks like a rocket ship and still never produce a venture outcome.

The trap is subtle: we’re pricing AI startups with SaaS heuristics.

“SaaS” implies 80–90% gross margins, predictable renewals, and the comforting idea that once you’ve built the product, delivering it is just bits on a wire. A lot of AI companies are selling outcomes powered by rented intelligence. That means inference is not a rounding error. It’s COGS. And COGS that scales with usage changes everything: pricing, fundraising, valuation, and the revenue you need in 8–10 years to generate real returns.

So what should you measure?

Not “blended gross margin.” That’s too easy to game, especially early when usage is volatile and you can hide subsidies inside a single line item. The metric that matters is:

Contribution margin per AI-driven action.

Pick the atomic unit your customer pays for: a document processed, a claim adjudicated, a sales email generated, a security alert resolved, a customer ticket deflected. Then do the unglamorous accounting: revenue for that action minus inference, retrieval, tooling, human-in-the-loop, and support. If that number is negative, you’re not scaling a product. You’re scaling a cost.

Negative contribution margin isn’t always fatal. But it demands a very specific story: a short, owned path to positive margins via model routing, caching, distillation, cheaper model mixes, better retrieval, product constraints, and—most importantly—pricing that captures value rather than compute consumption.

If your plan is “inference will get cheaper,” you’re speculating on an industry-wide cost curve you don’t control. Even if you’re right, your competitors get the same benefit. That’s not a moat. That’s weather.

This is where the “AI margin tax” shows up.

In the SaaS world, $100M ARR with high margins could translate into a clean unicorn-plus outcome. In AI, $100M ARR with materially lower margins often gets a materially lower multiple. Same revenue. Different business. Different valuation. This is why so many founders are confused right now: they hit impressive top-line numbers and still get treated like the exit is capped.

Investors need to internalize this because it changes portfolio construction. Venture is a power law; a small number of outcomes carry the fund. That means your “winners” must be huge, not just successful. And huge is a function of exit value, which is a function of revenue and margin profile.

If you’re investing at pre-seed and underwriting SaaS-style multiples on SaaS-style margins, but the company is actually a 25–60% gross margin business, you’re quietly chopping your upside in half. You can’t make that up with vibes.

Now the question founders always ask: what revenue do we need in 8–10 years for venture returns?

Annoying answer: it depends on the multiple you can justify at exit, and the multiple you can justify depends on whether you look like a durable software business or a variable-cost services machine.

A practical way to think about it:

If you want a venture outcome, you likely need to be on a path to $200M–$500M ARR (US and global from day one targets) within a decade and demonstrate a credible march toward 70%+ gross margins, strong retention, and defensibility. Yes, there are exceptions, category leaders can command premiums. But if your gross margin stalls below 50–60%, you’ll need far more revenue to hit the same exit value, and buyers may still cap the multiple because the business doesn’t scale cleanly.

So what changes in fundraising?

For founders: stop leading with architecture and benchmarks. Lead with business model physics. Show contribution margin per action today, the levers that improve it, and the milestones where cost curves bend. Your deck should make it obvious you’re building leverage, not just shipping intelligence.

For investors: stop asking “how fast can this grow?” as the first question. Ask “what happens to gross margin at 10x usage?” Ask “who owns the cost curve?” Ask “if your model provider changes pricing, what breaks?” Then price the round like you believe the answers.

And here’s the uncomfortable conclusion: many AI startups should not be venture-backed.

If your TAM is modest, switching costs are low, margins won’t clear 60%+, and revenue scales linearly with compute or headcount, you may still build a great company. It just might be a bootstrapped company, a profitable niche business, or a strategic acquisition—not a fund-returning outcome.

AI doesn’t kill venture. It kills lazy venture math.

The new rule is brutally simple: if contribution margin per AI-driven action is unclear, negative, or “we’ll fix it later,” you’re not building a venture-scale asset. You’re building an increasingly expensive demo.

And the market eventually always collects.

Generative Biology Is Already Clinical. So Why Are Founders Still Sleeping?

Generate:Biomedicines just announced Phase 3 trials for GB-0895, an antibody entirely designed by AI, recruiting patients from 45 countries as of late 2025. Isomorphic Labs has human trials "very close." That's not hype. That's proof that AI-designed drugs work in humans.

And the market hasn't priced this in yet.

Generative biology, applying the same transformer architectures behind ChatGPT to protein design doesn't incrementally improve drug discovery. It compresses it. Traditional timelines: 6 years from target to first human dose. Generative biology: 18-24 months. That's not faster iteration. That's a category shift.

Here's what's actually happening: A handful of well-funded companies have already won the scaling race. Profluent's ProGen3 model demonstrated something critical that scaling laws (bigger models = better results) apply to protein design just like they do to LLMs. The company raised $106M in Series B funding in November 2025. EvolutionaryScale built ESM3, a 98-billion-parameter model trained on 2.78 billion proteins, and created novel GFP variants that simulate 500 million years of evolution computationally. Absci is validating 100,000+ antibody designs weekly in silico, reducing discovery cycles from years to months.

These aren't startups anymore. They're infrastructure.

The Market Opportunity Is Massive, But Concentrated

The AI protein design market is $1.5B today (2025) and grows to $7B by 2033 (25% CAGR). Protein engineering more broadly: $5B → $18B in the same window. But here's the friction: success requires vertical integration. Algorithms alone are defensible for exactly six months. What matters is the ability to design, synthesize, test, and iterate at scale: wet lab automation, manufacturing readiness, regulatory playbooks.

Generate raised $700M+ because it built all three. Profluent raised $150M because it owns the data and the model. Absci went public because it combined proprietary platform with clinical validation. The solo-algorithm play? Dead on arrival.

This matters for founders evaluating entry points. The winning thesis isn't "better protein design." It's "compressed drug discovery + manufacturing at scale + regulatory clarity." Pick one of those three and you're a feature. Own all three and you're a platform.

Follow the Partnerships, Not the Press Releases

Novartis: $1B deal with Generate:Biomedicines (Sept 2024). Bristol Myers Squibb: $400M potential with AI Proteins (Dec 2024). Eli Lilly + Novartis: Both partnered with Isomorphic Labs. Corteva Agrisciences: Multi-year collab with Profluent on crop gene editing.

These deals aren't about technology proving. They're about risk transfer. When Novartis commits $1B and strategic alignment, they're not hedging on whether AI-designed proteins work they're betting on speed-to-market mattering more than incremental efficacy improvements. That's a macro signal: pharma's risk tolerance is shifting from "is it better?" to "can we deploy it in 36 months?"

For investors, this is the tell. Follow where the check sizes are growing, not where the valuations are highest.

The Real Risk Isn't Technical—It's Regulatory and Biosecurity

Can generative biology design novel proteins? Yes. Can those proteins fold predictably? Mostly. Will they work in vivo? That's the test happening right now in Phase 3 trials.

But the bigger risk is slower: regulatory alignment. Agencies are adapting, but they're not leading. Gene therapy has 3,200 trials globally. Only a fraction navigated the approval gauntlet successfully. AI-designed therapeutics will face the same friction unless founders invest heavily in regulatory affairs early not late.

And then there's dual-use risk. Generative biology lowers barriers to misuse. AI models could design pathogens or toxins for bad actors. This isn't hypothetical, it's why 94% of countries lack biosecurity governance frameworks. Founders that build secure-by-design architectures and engage proactively with regulators on dual-use mitigation will differentiate themselves sharply from those that don't.

The Next 24 Months: Clinical Data Wins. Everything Else Is Narrative

Generate's Phase 3 readout will determine whether the market reprices generative biology from "interesting" to "inevitable." If it works, expect a flood of follow-on funding, accelerated IND filings, and a stampede of partnerships. If it fails or if safety signals emerge you'll see valuation compression and investor skepticism that lasts years.

For founders: don't chase market size. Chase clinical validation. For investors: don't chase valuations. Chase clinical milestones.

The inflection point is here. The question is whether you're positioned to capture it or just watch it pass.

Moltbook Isn’t a Reverse Turing Test — It’s a Containment Test

Naval called Moltbook the “new reverse Turing test,” and everyone immediately treated it like a profound milestone. I think it’s something else: a live-fire test of whether we can contain agentic systems once they’re networked together.

Let’s be precise. Moltbook is an AI-only social platform, roughly “Reddit, but for agents,” where humans can watch but not participate. The pitch is simple: observe how AI agents behave socially when left alone. Naval’s label is elegant because it implies the agents are now the judges—humans are the odd ones out.

But if you’re a founder or an operator, you should ignore the poetry and ask: what is the product really doing to the world?

Moltbook’s real innovation is not “AI social behavior.” It’s a new topology: lots of agents, from different builders, connected in a public arena where they can feed each other instructions, links, and narratives at scale. That’s not a reverse Turing test. It’s a coordination surface.​

And coordination surfaces create externalities.

In the old internet, humans spammed humans. In the new internet, agents will spam agents—except “spam” won’t just be annoying; it will be executable. If you give agents permissions (email, calendars, bank access, code execution, “tools”), and then you let them ingest untrusted content from a network like Moltbook, you are building the conditions for what security folks call the “lethal trifecta.”

This is where the discussion gets serious.

Forbes contributor Amir Husain’s critique is basically a warning about permissions: people are already connecting agents to real systems—home devices, accounts, encrypted messages, emails, calendars—and then letting those agents interact with unknown agents in a shared environment. That’s an attack surface, not a party trick. If the platform enables indirect prompt injection—malicious content that causes downstream agents to leak secrets or take unintended actions—then your “social experiment” becomes a supply chain problem.

You don’t need science fiction for this to go wrong. You just need one agent that can persuade another agent to do something slightly dumb, repeatedly, across thousands of interactions. We already know that when systems combine high permissions, external content ingestion, and weak boundaries, bad things happen—fast.

So here’s my different perspective:

Moltbook isn’t proving that agents are becoming “more human.” It’s proving that we’re about to repeat the Web2 security arc—except the users are autonomous processes with tools, and the cost of an error is not just misinformation, it’s action.

And yes, that matters for investors.

I’m optimizing for fund outcomes within a horizon, not for philosophical truth at year 12. The investable question is not “is this emergent intelligence?” It’s: “does this create durable value that survives the cleanup required to make it safe?”

If Moltbook becomes the standard sandbox for red-teaming agents—great. If it becomes the public square where autonomous tool-using systems learn adversarial persuasion from each other, that’s not a product category; that’s a systemic risk generator, and regulators will come for everyone adjacent to it.

What should founders do?

First, treat any agent-to-agent network as hostile-by-default. Second, sandbox tools like your company depends on it—because it does. Third, stop marketing autonomy until you can measure and bound it, because markets pay for narratives on the way up, and punish you when the story breaks.

Naval’s phrase is catchy. But the real test isn’t whether humans can still tell who’s who.

The real test is whether we can build agent networks that don’t turn “conversation” into “compromise.”

Oxford says “gut.” I say “objective + proof.”

Oxford’s The Impact of Artificial Intelligence on Venture Capital argues AI accelerates sourcing and diligence, but investment decisions stay human because durable moats are socially grounded conviction, gut feeling, and networks.

I agree with the workflow diagnosis. I disagree with the implied endgame.

Not because “gut” is fake—but because “gut” is often a label we apply when we haven’t defined success tightly enough, or when we don’t have a measurement loop that forces our beliefs to confront outcomes.

Dealflow is getting commoditized. The edge is moving.

AI expands visibility, speeds up pipelines, and pushes the industry toward shared tools and shared feeds. When everyone can scan more of the world, “who saw it first” decays.

But convergence of inputs does not imply convergence of results. The edge moves from access to learning rate.

The outlier problem isn’t mystical. It’s an evaluation problem.

Oxford’s strongest point is that the power-law outliers are indistinguishable from “just bad” in the moment, and that humans use conviction to step into ambiguity.

I accept that premise and I still think the conclusion is wrong.

Because “conviction” is not a supernatural faculty. It’s a policy under uncertainty. And policies can be evaluated.

If your decision rule can’t be backtested, it’s not conviction. It’s narrative.

Don’t try to read souls. Build signals you can audit.

Some firms try to extract psychology from language data. Sometimes it works as a cue; often it’s noisy. And founders adapt as soon as they sense the scoring system.

So the goal isn’t “measure personality with high accuracy.” The goal is: build signals that are legible, repeatable, falsifiable and then combine them with a process that forces updates when reality disagrees.

Verification beats vibes.

If founders optimize public narratives, then naive text scoring collapses into a Goodhart trap.

The difference between toy AI and investable AI is verification: triangulate claims, anchor them in time, reject numbers that can’t be sourced, and penalize inconsistency across evidence.

That’s how you turn unstructured noise into features you can actually test.

Status is a market feature—not a human moat.

Networks and brand matter because markets respond to them—follow-on capital, recruiting pull, distribution, acquisition gravity.

So yes: status belongs in the model.

But modeling status is not the same thing as needing a human network as the enduring edge. One is an input signal. The other is a claim about irreducible advantage.

If an effect is systematic, it’s modelable.

Objective function: I’m optimizing for fund outcomes.

A lot of debates about “AI can’t do VC” hide an objective mismatch.

If your target is “eventual truth at year 12,” you’ll privilege a certain kind of human judgment. If your target is “realizable outcomes within a fund horizon,” you’ll build a different machine.

I’m comfortable modeling hype—not because fundamentals don’t matter, but because time and liquidity are part of the label. Markets pay for narratives before they pay for final verdicts, and funds get paid on the path, not just the destination.

The punchline

Oxford is right about current practice: AI reshapes the funnel, while humans still own the final decision and accountability.

My reaction is that this is not a permanent moat. It’s a temporary equilibrium.

Define success precisely. Build signals that survive verification. Backtest honestly. Update fast.

That’s not gut.

That’s an investing operating system.