The riskiest startups to found, from an AI analysis of YC's entire portfolio

Submitted by
Style Pass
2024-09-25 17:30:07

I’ve been using Nomic a lot to analyze unstructured data, such as looking for coverage gaps among the companies advertising on OpenAds’ network. Out of curiosity, I scraped and visualized ~5,000 YC companies, their descriptions, and their outcomes. TL;DR: here’s the app to analyze any startup by its description.

This works by generating an embedding of each company’s description, reducing the dimensionality of those embeddings, visualizing the resulting clusters, and inferring labels for them. The closer two dots appear, the more closely related the companies’ descriptions are.
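To make the "closer means more related" idea concrete, here is a minimal sketch of the embedding and similarity step only. It is not the pipeline used for the map: normalized word counts stand in for a real embedding model, the company names and descriptions are made up for illustration, and the dimensionality-reduction and labeling steps are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text, vocab):
    """Toy stand-in for an embedding model: L2-normalized word counts."""
    counts = Counter(tokenize(text))
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical companies and descriptions, invented for this example.
descriptions = {
    "AeroVolt": "electric passenger aircraft for regional flights",
    "SkyCharge": "electric aircraft for short passenger flights",
    "QuickCart": "grocery delivery in ten minutes",
}

vocab = sorted({w for d in descriptions.values() for w in tokenize(d)})
vectors = {name: embed(d, vocab) for name, d in descriptions.items()}


def nearest(name):
    """Return the other company whose description vector is most similar."""
    scored = [(cosine(vectors[name], v), other)
              for other, v in vectors.items() if other != name]
    return max(scored)[1]


print(nearest("AeroVolt"))
```

With a real embedding model the vectors would capture meaning rather than shared words, but the comparison works the same way: descriptions that land near each other in vector space end up as neighboring dots.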

There are obvious clusters: the blue retail cluster is all the 10-minute grocery delivery startups founded after DoorDash. Aviation’s purple cluster is illustrative. The dense center of the southern subcluster is all electric passenger aircraft, and moving outward leads to more varied companies, like supersonic UAVs. The northeast aviation subcluster is space tech.

If we filter by YC batch, we can see the post-ChatGPT cohorts (S23+). We see which areas got hollowed out (retail, fintech, education) and which new areas are hot (cellular engineering, anything to do with LLMs, AI for sales and customer support). At a glance, there’s actually a very wide variety of AI startups, judging by how sparse their cluster is, with some local clumps of similar startups, such as those doing LLM evals.
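The same before-and-after comparison can be done numerically. The sketch below is one way to do it, not how the map itself was built: it assumes each company already has a cluster label and a batch code like `S23`, it only handles winter (`W`) and summer (`S`) batches, and the rows are made up for illustration.

```python
from collections import Counter

# Assumption: only winter and summer batch codes appear in the data.
SEASON_ORDER = {"W": 0, "S": 1}


def batch_key(batch):
    """Turn a batch code such as 'S23' into a sortable (year, season) tuple."""
    return (int(batch[1:]), SEASON_ORDER[batch[0]])


def cluster_shares(companies):
    """Fraction of the given companies that fall in each cluster."""
    counts = Counter(c["cluster"] for c in companies)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def share_shift(companies, cutoff="S23"):
    """Change in each cluster's share from pre-cutoff batches to later ones."""
    cut = batch_key(cutoff)
    before = cluster_shares(
        [c for c in companies if batch_key(c["batch"]) < cut])
    after = cluster_shares(
        [c for c in companies if batch_key(c["batch"]) >= cut])
    labels = set(before) | set(after)
    return {l: after.get(l, 0.0) - before.get(l, 0.0) for l in labels}


# Hypothetical rows, invented for this example.
companies = [
    {"batch": "W21", "cluster": "retail"},
    {"batch": "S21", "cluster": "retail"},
    {"batch": "W22", "cluster": "fintech"},
    {"batch": "W23", "cluster": "ai"},
    {"batch": "S23", "cluster": "ai"},
    {"batch": "W24", "cluster": "ai"},
    {"batch": "S24", "cluster": "fintech"},
    {"batch": "W24", "cluster": "ai"},
]

shift = share_shift(companies)
for label in sorted(shift, key=shift.get):
    print(f"{label}: {shift[label]:+.2f}")
```

A negative shift marks a cluster that shrank as a share of newer batches, and a positive one marks a cluster that grew.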
