This version 1 study applies machine learning to identify high-potential AI startups from the 2017-2019 cohorts, with the aim of informing investment decisions.
Project Framework
We developed a machine learning methodology to identify promising AI ventures across two cohorts: 2017-2018 (475 companies) and 2019 (329 companies).
Our approach incorporated:
- Six predictive variables: Industries, Company Description, Founder Biography, Founder Gender, Location, and Educational Background
- A dual-model ensemble combining Random Forest and XGBoost algorithms
- Advanced text vectorization for unstructured data
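To make the approach above concrete, here is a minimal sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions, not our production code: the column names, the TF-IDF vectorizer, the one-hot encoding, and the soft-voting combination are all assumed for the example, and scikit-learn's GradientBoostingClassifier stands in for XGBoost so the sketch depends only on pandas and scikit-learn (xgboost's XGBClassifier could be swapped into the same slot). The eight rows are invented placeholders, not companies from our dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical column names for the six predictive variables.
TEXT_COLS = ["description", "founder_bio"]
CAT_COLS = ["industries", "founder_gender", "location", "education"]


def build_pipeline(seed: int = 0) -> Pipeline:
    """TF-IDF for free text, one-hot for categories, soft-voting ensemble."""
    features = ColumnTransformer(
        [(f"tfidf_{c}", TfidfVectorizer(), c) for c in TEXT_COLS]
        + [("onehot", OneHotEncoder(handle_unknown="ignore"), CAT_COLS)]
    )
    ensemble = VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
            # Stand-in for XGBoost (assumption made for this sketch).
            ("gb", GradientBoostingClassifier(n_estimators=50, random_state=seed)),
        ],
        voting="soft",  # average the two models' predicted probabilities
    )
    return Pipeline([("features", features), ("ensemble", ensemble)])


def top_k(pipeline: Pipeline, companies: pd.DataFrame, k: int) -> list:
    """Rank companies by predicted probability of the positive class."""
    scores = pipeline.predict_proba(companies)[:, 1]
    ranked = companies.assign(score=scores).sort_values("score", ascending=False)
    return ranked["name"].head(k).tolist()


# Invented placeholder rows, for illustration only.
toy = pd.DataFrame(
    {
        "name": [f"company_{i}" for i in range(8)],
        "description": [
            "distributed training platform for enterprise teams",
            "photo filters for social sharing",
            "clinical notes automation for hospitals",
            "casual mobile puzzle game",
            "threat detection for connected devices",
            "coupon browser extension",
            "usage based insurance pricing models",
            "recipe recommendation chatbot",
        ],
        "founder_bio": [
            "phd in systems, former research engineer",
            "marketing manager, first time founder",
            "physician and data scientist",
            "game designer",
            "security researcher, prior exit",
            "sales lead",
            "actuary and ml engineer",
            "student founder",
        ],
        "industries": ["infrastructure", "consumer", "healthcare", "consumer",
                       "security", "consumer", "fintech", "consumer"],
        "founder_gender": ["f", "m", "f", "m", "m", "f", "m", "f"],
        "location": ["hub_a", "hub_b", "hub_a", "hub_c",
                     "hub_a", "hub_b", "hub_a", "hub_c"],
        "education": ["phd", "ba", "md", "ba", "ms", "ba", "ms", "ba"],
        "label": [1, 0, 1, 0, 1, 0, 1, 0],
    }
)

X = toy.drop(columns="label")
pipe = build_pipeline()
pipe.fit(X, toy["label"])
picks = top_k(pipe, X, k=3)
print(picks)
```

In a real run the model would be fitted on one set of companies and used to score a held-out set; scoring the training rows here only keeps the sketch short.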
Portfolio Performance
Our model selected 15 companies across both time periods:
2017-2018 Selections (10): Jerry, Health Note, Cylera, Deep Cognition, Determined AI, NoTraffic, MovieBot, SupplyHive, Kami Vision, Rowzzy
2019 Selections (5): Eleos Health, Anyscale, Baseten, Anvilogic, Fairmatic
Current Performance:
- 6 companies (40%) achieved valuations exceeding $500M
- 3 companies (20%) demonstrate strong growth trajectories
- 3 companies (20%) show steady growth
- 3 companies (20%) have ceased operations
This 40% high-performer rate significantly outperforms typical venture capital success rates of 10-20%, while the 20% failure rate is substantially lower than industry averages of 75%. These figures do not factor in the various constraints of real-world investing.
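The rates quoted above follow directly from the outcome counts. The short sketch below reproduces that arithmetic and lines it up against the benchmark figures cited in the text; the variable names are ours, chosen for illustration, and the benchmarks are simply the 10-20% and 75% figures stated above.

```python
# Outcome counts for the 15 selected companies, as listed above.
outcomes = {
    "valuation above $500M": 6,
    "strong growth": 3,
    "steady growth": 3,
    "ceased operations": 3,
}

total = sum(outcomes.values())
shares = {label: count / total for label, count in outcomes.items()}

# Benchmark figures as quoted in the text.
VC_SUCCESS_RANGE = (0.10, 0.20)
INDUSTRY_FAILURE_RATE = 0.75

high_performer_rate = shares["valuation above $500M"]
failure_rate = shares["ceased operations"]

for label, share in shares.items():
    print(f"{label}: {outcomes[label]}/{total} = {share:.0%}")

print(f"High performers vs. top of quoted VC range: "
      f"{high_performer_rate:.0%} vs. {VC_SUCCESS_RANGE[1]:.0%}")
print(f"Failures vs. quoted industry average: "
      f"{failure_rate:.0%} vs. {INDUSTRY_FAILURE_RATE:.0%}")
```

With only 15 selections, each company moves a rate by almost 7 percentage points, so these shares are best read as indicative.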
Key Investment Domains
Four predominant themes emerged:
- Enterprise AI Infrastructure (Determined AI, Anyscale)
- Healthcare AI Applications (Eleos Health, Health Note)
- Security Solutions (Cylera, Anvilogic)
- Financial Technology (Jerry, Fairmatic)
Investment Implications
Successful AI ventures consistently demonstrate:
- Enterprise-focused solutions with clear value propositions
- Technical excellence within founding teams
- Strategic presence in major technology ecosystems
While our model demonstrates strong predictive capability, it remains a decision support tool rather than a replacement for comprehensive due diligence.
We plan to run more and larger permutations of this analysis within AI, extend it to more geographies and sectors, and publish the results once they are complete.
Disclaimer: This analysis is for educational purposes only. Past performance does not guarantee future outcomes.