Vertical AI Companies: The Second Wave
The First Wave
It’s been roughly 15 months since the emergence of ChatGPT (both as a product and as an API).
Since then, we have seen an explosion of products/companies in both B2B and consumer-land that are leveraging the magical capabilities of LLMs and GenAI. The spectrum of ambition has been impressive. Companies are trying to re-imagine everything from fundamental technology experiences (new search engine, new browser) to changing our conception of prosaic workflows (booking travel, doing taxes, getting feedback on content).
The first generation of these LLM-powered products often consisted of thin wrappers around GPT-4-class models. At the risk of generalization, these products were characterized by:
Little proprietary data: There was no real proprietary data being used to train or fine-tune the models.
Standard architecture: The technical architecture was often simplistic, with the extent of the sophistication being RAG (in terms of context being passed to the LLMs) and/or prompting techniques (chain-of-thought, recursive prompting, best-of-N prompting, etc.); a minimal sketch of this pattern follows this list.
Boring interfaces: Most of the interfaces hewed to what we expected of a text-generation model. They were often chat-oriented and text-heavy.
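To make the pattern concrete, here is a minimal sketch of what many first-wave products amounted to: retrieve some context, wrap it in a prompt, call a SOTA model. The document list and keyword-overlap retrieval below are hypothetical stand-ins for a real vector store, and the OpenAI client usage assumes an API key in the environment.

```python
# A minimal sketch of the first-wave "thin wrapper" pattern.
# DOCS and the toy retrieval are illustrative stand-ins for a vector store.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

DOCS = [
    "Refund policy: customers may request refunds within 30 days.",
    "Shipping: orders ship within 2 business days.",
]

def search_docs(query: str, k: int = 3) -> list[str]:
    # Toy retrieval: rank documents by keyword overlap with the query.
    words = query.lower().split()
    return sorted(DOCS, key=lambda d: -sum(w in d.lower() for w in words))[:k]

def answer(query: str) -> str:
    context = "\n\n".join(search_docs(query))
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}\n"
                                        "Think step by step before answering."},  # basic CoT prompt
        ],
    )
    return resp.choices[0].message.content
```

That is more or less the whole product: no proprietary data, no model changes, a chat box on top.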
All this led to a class of products that simultaneously felt pretty cool but also quite flat. These products didn’t have particularly deep moats and generally suffered from low long-term engagement. To put it simply, none of them offered something vastly better than a chat interface over a SOTA model (ChatGPT, Gemini Pro, etc.)
This essentially led to one of three scenarios:
A lack of a moat and hence a flood of players into the same space (how many LLM-powered travel planning apps are out there?).
Insufficient differentiation from the SOTA models and their interfaces. This is why most people just prefer to use ChatGPT/Gemini/Claude etc.
Incumbents bolting on LLMs to their current products and taking the air out of the room.
In and of itself, this isn’t surprising, because we didn’t yet understand how best to utilize the magical LLM technology to make differentiated, engaging, and beautiful products.
That is actively changing now as we get into the second wave and figure out how to actually build “thick” vertical AI companies.
Below are some thoughts on how to survive and thrive in this second wave.
The Second Wave
The most important aspect of building a thicker AI company is to change your mindset. You have to think of the foundation/base model as a starting component and not as the finished product. You have to be comfortable learning about how these models work at a deeper level and how you can retrain/tune/massage them into doing what you need them to do. You have to be confident enough to change the (meta) model architecture. And finally you have to really think hard about the interaction modalities and their interplay with the stochastic nature of the system.
Training Data:
Perhaps controversially, I believe that the most important part of building a thick AI company is to assemble a proprietary dataset. Collect high quality data that is relevant to your product. And collect as much of it as possible.
There are many approaches to collecting a dataset that you can use for model training/tuning:
Hiring experts: You can literally hire experts in your field to create a dataset. For instance, if you are building a travel app, you can consider hiring travel agents, travel planners, etc. to create a curated dataset. Ditto for CPAs, financial planners, therapists, doctors, and so on.
Leveraging your own organization: If you are building a product within a larger organization, a great approach can be to run an AI Data Day: everyone sits down and creates data for the model (as an aside: there is a whole company to be built enabling this).
Pruning open-source datasets: There is a lot of prior data out there that can be cleaned up and massaged to suit your particular purpose.
Buying datasets: There are a lot of interesting datasets out there that are available for purchase (and can then be cleaned up or pruned).
Synthetic datasets: This is early and emerging, but a number of the more adept organizations are using models to synthetically create domain-specific data (with a seed from human-obtained data). There is a lot of room to be creative here; a sketch of this approach follows this list.
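Here is a minimal sketch of the seeded-synthetic approach, reusing the travel example from above. The seed examples, prompt, and domain are all illustrative, not a prescription:

```python
# Sketch: generate synthetic training examples from a small human-curated seed.
# SEED_EXAMPLES and the prompt wording are illustrative.
import json
from openai import OpenAI

client = OpenAI()

SEED_EXAMPLES = [
    {"request": "3 days in Lisbon with two kids, mid-range budget",
     "plan": "Day 1: Belém and the Jerónimos Monastery. Day 2: Oceanário. Day 3: Sintra."},
]

def synthesize(n: int = 5) -> list[dict]:
    prompt = (
        "Here are examples of travel requests paired with expert itineraries:\n"
        + json.dumps(SEED_EXAMPLES, indent=2)
        + f"\n\nGenerate {n} new, diverse examples in the same JSON format. "
        "Vary destination, trip length, and budget. Return only a JSON array."
    )
    resp = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    # Model output is not guaranteed to be valid JSON; validate before use.
    return json.loads(resp.choices[0].message.content)
```

In practice you would filter the generated examples (heuristics, a critic model, or a human pass) before they enter the training set.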
Too many founders believe that they are data-poor without actually pushing the limit. Some of the most interesting model advancements over the last 3-4 years have come as a result of someone figuring out an interesting take on how to massage/access data.
Model Architecture:
There has been a reluctance thus far for teams to really get in there and treat foundation models as a starting point from which to alter their composition and behavior. There are now quite a few methods for tweaking, fine-tuning, and tailoring these models that are much more effective than simply passing them long context windows and good prompts. Once you start applying techniques like LoRA, the lines between training your own model and using a SOTA foundation model really start to blur.
The key aspect is to not treat these models reverentially. Every strong engineering team is fundamentally capable of utilizing these advanced techniques for model iteration; this is no longer the sole domain of deep AI companies. Below is a short list of techniques to think about, though the full list is much longer:
In-depth fine-tuning: Tailoring models more closely to specific tasks or domains to enhance performance and relevance.
Low-rank adaptation (LoRA): A method that allows for efficient fine-tuning of large models by adjusting a small subset of parameters, making it accessible for companies to customize models without extensive computational resources (see the sketch after this list).
Multi-step planning, early reasoning: The best recent example of this is Devin, which is able to string together a long chain of actions using meta-structures on top of the models.
Compound architectures: Combining multiple models or techniques to achieve more complex, nuanced, and effective AI functionality.
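As one concrete example, here is roughly what a LoRA setup looks like with Hugging Face’s peft library. The base model choice and hyperparameters below are illustrative, not recommendations:

```python
# Sketch: wrap a base model with LoRA adapters so only a small set of
# low-rank matrices is trained. Model and hyperparameters are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of base weights

# From here, train with the standard transformers Trainer on your
# proprietary dataset; the base weights stay frozen.
```

The point is that this is a few dozen lines on top of open tooling, not a research lab’s worth of infrastructure.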
Interfaces & Modalities:
AI-first design™️ is something we are just starting to understand, but I think it’s important not to try to jam old modalities and interfaces onto new fundamental capabilities.
The interfaces (both for input and output) can be significantly different from our current mental model of an LLM-based chatbot. If anything, you want to swing the pendulum hard away from a back-and-forth text-based interface.
It is also quite likely that AI will enable a class of products that are much more UI-light and async. They won’t need rich UI/UX to manipulate workflows or tasks, as more and more of the computational and directive work will be done in the background.
Other aspects of interface design that are going to be critical:
Embrace stochasticity: LLMs are intrinsically stochastic. You should embrace this constraint and not try to hide it from the user. Users are surprisingly understanding of this limitation.
Augment with humans: Some of the more interesting products have sophisticated fallbacks to humans that work well for corner cases (a sketch follows this list).
Multi-modality (again): Images. Videos. Audio. Use all of it.
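Here is one sketch of what a human fallback can look like in code. The confidence-gated prompt, the threshold, and the `escalate_to_human` queue are all hypothetical:

```python
# Sketch: gate the model's answer on a self-reported confidence score and
# route low-confidence cases to a human queue. Threshold is illustrative.
import json
from openai import OpenAI

client = OpenAI()

def escalate_to_human(question: str) -> str:
    """Hypothetical: push the question to an expert queue, return a ticket id."""
    return "ticket-123"

def answer_or_escalate(question: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"{question}\n\nReply as JSON: "
                       '{"answer": "...", "confidence": <number from 0.0 to 1.0>}',
        }],
    )
    result = json.loads(resp.choices[0].message.content)
    if result["confidence"] < 0.7:  # illustrative threshold
        return {"status": "escalated", "ticket": escalate_to_human(question)}
    return {"status": "answered", "answer": result["answer"]}
```

The escalated cases are doubly valuable: they keep users happy at the margins, and they become exactly the training data you lack.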
More fundamentally, most founders are treating design as an afterthought, which leaves a ton on the table. If you look and feel like ChatGPT (or some minor variant), then you have no chance of leaving a deep imprint on the user’s consciousness about your product’s magic.
Compounding feedback loops
The powerful part of this next generation of apps comes when you get all three components working in concert in a positive feedback loop.
A differentiated interface should allow you to collect more new data for training the app (and certainly for fine-tuning). This increasing proprietary dataset will (at a certain point of scale) allow you to be at the forefront of experimenting with new model architectures. And so the world rotates faster for you.
The Analogy to Mobile circa 2012
The first phase of mobile apps translated web 1.0/2.0 apps onto the mobile interface. But the second wave of apps really started to deeply understand what was uniquely possible through mobile as a technology and platform. It required understanding the intricacies of the technology and really massaging it to build magical experiences. It also required fundamentally new design paradigms. Then we got apps like Uber, Instagram, and so on.
This also maps to a time when startups talked about being a “mobile company”. Today everyone building a product needs to be a “mobile” company. It is table stakes.
Where we go from here
The ecosystem is moving so quickly that it’s certainly possible all of this will feel hopelessly out of date by the end of the year, but I do think there is a secular shift underway in how we need to think about building the next generation of Vertical AI apps. At the very least, we will not be going back to the vanilla RAG + GPT-4 world. There is a lot of possibility out there; let’s go get it.
Thanks to Elad Gil, Amjad Masad, Arnaud Benard, Evan Tana and Gopal Raman for feedback!