As predicted, software has eaten the world. But AI will subsume it.
The next wave of generational companies will be pure-AI shops. (0/n)
Tech falls into 4 eras: mainframes, chips, personal computers, and software (pre-web, Web1/2, SaaS).
In each era, there were tailwinds for company building. Later, these tailwinds became headwinds: you were competing against well-established incumbents in saturated markets. (1/n)
What makes the "software era" unique is its low cost of entry. Building a chip company requires at least tens of millions in capital. Building a software app costs a handful of AWS credits.
As a result, the software era has birthed a record number of generational companies (2/n)
But the tailwinds have begun to fade, largely driven by:
1. Market saturation for both SaaS and consumer ad-based products. 2. A changing macro environment (high inflation and rising interest rates).
There’s less alpha in starting a traditional software company (3/n)
Meanwhile, new eras of tech are marked by a few factors:
1. Higher barriers to entry and slowing market growth in the previous era. 2. Research breakthroughs or a massive shift in consumer behavior. 3. Bigger market opportunities than ever before. (4/n)
Human augmentation and automation with AI will be the next era of tech.
1. Existing moats are massive, and market growth is slowing for SaaS and consumer-attention products. 2. Breakthroughs in scaling neural networks. 3. Replacing human labor is the largest market opportunity in humanity's history. (5/n)
We are already seeing the augmentation/replacement of graphic artists with @OpenAI’s #dalle2 and coders with #copilot. Another great example is @AdeptAILabs, which is building language-based tools to boost productivity when using existing software products. (6/n)
Most existing “AI companies” use AI to improve their products. There has yet to be a massive company that is a pure AI play. Pure AI plays mark a departure from modern venture and a return to the older era of large capital costs (for compute and data acquisition). (7/n)
In the short term, with high labor costs, these companies will have a massive advantage, given that their core product replaces labor hours. In hiring engineers and researchers, AI companies also gain an “excitement” edge over traditional software. (8/n)
They require fast growth to acquire data moats and train even larger models, building an expensive yet powerful flywheel and raising barriers for new competitors.
Even more enticing: the market winners will have accumulated enough data and researchers to have a shot at AGI. (9/n)
This doesn't mean that the current dominant players (FAAMG+) will become obsolete. Many of them have survived substantial era changes. To remain relevant, they will need to incorporate increasingly sophisticated AI (as many of them are already doing). (10/n)
And this is not to say that it will be impossible to start traditional software companies. It will just be harder than in the previous few decades and less value-additive than working on AI. (11/12)
But my money is on the next generation of AI companies fueled by years of incredible Deep Learning research. (12/12)
Standard prompting libraries use variants of “f-strings” with inputs substituted in.
For us, a prompt is defined as a function that maps some set of inputs X and a token budget n to some string, s:
p(X, n) = s
We call this operation "rendering"
(2/12)
For example, my inputs X could include conversation history, the contents of the current file, chunks of documentation, and codebase context we deem relevant.
These might sum to 100K tokens, but the budget we are working with may be just 4,000 tokens.
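A minimal sketch of what a render function like this could look like (the names PromptInput, count_tokens, and render are illustrative assumptions, not the actual implementation): greedily pack the highest-priority inputs until the n-token budget is exhausted.

```python
from dataclasses import dataclass


@dataclass
class PromptInput:
    """One candidate piece of context, with a priority used for budget trade-offs."""
    name: str
    text: str
    priority: int  # higher = kept first when the budget is tight


def count_tokens(text: str) -> int:
    # Stand-in tokenizer; real code would use the model's tokenizer (e.g. tiktoken).
    return len(text.split())


def render(inputs: list[PromptInput], n: int) -> str:
    """p(X, n) = s: pack the highest-priority inputs into an n-token budget."""
    parts, used = [], 0
    for item in sorted(inputs, key=lambda i: -i.priority):
        cost = count_tokens(item.text)
        if used + cost <= n:
            parts.append(item.text)
            used += cost
    return "\n\n".join(parts)


# ~100K tokens of candidate context rendered down to a 4,000-token prompt.
prompt = render(
    [
        PromptInput("conversation", "...conversation history...", priority=3),
        PromptInput("current_file", "...contents of the current file...", priority=2),
        PromptInput("docs", "...chunks of documentation...", priority=1),
        PromptInput("codebase", "...relevant codebase context...", priority=1),
    ],
    n=4000,
)
```

A real renderer would presumably also truncate individual items rather than drop them whole, but the shape is the same: the prompt is a pure function of the inputs X and the budget n.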
We've seen two key advantages of Turbopuffer, with no performance degradation:
1. Typical vector database pricing makes no sense for our workloads (lots of moderate-sized indices). 2. The usual “pods” or cluster-based indices (Pinecone's, for example) add unnecessary complexity.
(2/10)
Most vector databases store their indices in memory.
For older use cases, this made sense: a given customer would have a few large vector indices with consistently high usage on each index,
and the index should be in memory for high-throughput, low-latency querying.
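A back-of-envelope sketch of why the workload shape matters; every number below is an illustrative assumption, not our actual figures:

```python
# Illustrative memory-footprint arithmetic; all numbers here are assumptions.
DIM = 1536           # embedding dimension (assumed)
BYTES_PER_FLOAT = 4  # float32, ignoring index/graph overhead


def index_bytes(num_vectors: int) -> int:
    return num_vectors * DIM * BYTES_PER_FLOAT


# Older shape: a few big, consistently hot indices -> keeping them in RAM pays off.
few_big = 3 * index_bytes(50_000_000)

# Our shape: many moderate-sized, mostly idle indices (e.g. one per codebase).
many_small = 100_000 * index_bytes(20_000)

print(f"few big:    {few_big / 1e12:.1f} TB of RAM")    # ~0.9 TB, all hot
print(f"many small: {many_small / 1e12:.1f} TB of RAM")  # ~12.3 TB, mostly cold
```

Paying for RAM to keep mostly-cold indices resident is what breaks the pricing for this kind of workload; a storage-backed approach only pays the hot-path cost for indices that are actually being queried.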
People claim LLM knowledge distillation is trivial with logprobs, but that's not quite right...
It's very tricky to distill between models with different tokenizers. [1]
Internally, we've solved this with a clever algorithm we call tokenization transfer.
(1/7)
To start, we needed to build a sophisticated primitive called the "Logmass Trie".
It's an extended trie where each edge contains not only a character but also a weight representing the log probability of that character conditional on the string so far.
(2/7)
This edge weight is just an estimate.
But it must satisfy the constraint that, for a contained string X, summing the log probabilities of the edges on the path to X gives the log probability of X.
(3/7)
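A minimal sketch of what such a structure could look like. The class and method names are mine, and deriving edge weights from subtree log-masses is one way (not necessarily ours) to satisfy the constraint above: the path sums then telescope to log P(X) for stored, prefix-free strings whose total inserted mass is ~1.

```python
import math


def logaddexp(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


class Node:
    def __init__(self):
        self.children: dict[str, "Node"] = {}
        self.logmass: float = -math.inf  # log of total probability mass below this node


class LogmassTrie:
    """Trie whose edges carry an estimated log P(char | prefix)."""

    def __init__(self):
        self.root = Node()

    def insert(self, s: str, logprob: float) -> None:
        """Add string s with log probability logprob, updating masses along its path."""
        node = self.root
        node.logmass = logaddexp(node.logmass, logprob)
        for ch in s:
            node = node.children.setdefault(ch, Node())
            node.logmass = logaddexp(node.logmass, logprob)

    def edge_logprob(self, prefix: str, ch: str) -> float:
        """Estimated log P(ch | prefix): child subtree mass minus parent subtree mass."""
        node = self.root
        for c in prefix:
            node = node.children[c]
        return node.children[ch].logmass - node.logmass

    def path_logprob(self, s: str) -> float:
        """Sum of edge weights on the path to s; telescopes to log P(s) for stored strings."""
        return sum(self.edge_logprob(s[:i], s[i]) for i in range(len(s)))


trie = LogmassTrie()
trie.insert("cat", math.log(0.6))
trie.insert("car", math.log(0.4))
print(trie.edge_logprob("ca", "t"))  # ~log(0.6)
print(trie.path_logprob("cat"))      # ~log(0.6): edge log-probs sum to log P("cat")
```

As the thread notes, the per-edge weights are only estimates (here, mass ratios); the invariant that matters is that they sum along a path to the log probability of the contained string.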