Data is the current limiting factor in AI. Infra is pretty good, and models simply need more training data to push the limits of what is possible. 1/4
Example: GPT-3 was trained on most of the internet’s available text data. But more data will probably result in a more performant model. So, where does one find more text data? 2/4
One approach is to convert audio to text. @Spotify recently released audio-text data from 100,000 podcasts.
Another approach is to use synthetic generation, whereby one AI model generates data to train another AI model. 3/4
Companies that are already deploying AI models at scale have an inherent advantage in acquiring massive repositories of training data. Existing scale will likely lead to superior model performance in the near to medium term.
Data advantages at work! 4/4
• • •
Legacy companies are going to struggle with AI for two reasons: 1/8
First, they are risk-averse. Deep learning models are black boxes, and it’s kind of impossible to explain why a model produced a specific output. Sometimes the output is totally unexpected. Sometimes it’s offensive. Remember Tay, Microsoft’s offensive chatbot? 2/8
This will inevitably confuse and anger the risk-averse middle managers at legacy companies. They will default to “no” because their incentive structure rewards stable, low-risk execution. 3/8
1900s: Economies of Scale
2000s: Network Effects
2020s: Data Advantages
1/14
Economies of Scale: In the 1900s, the dominant companies benefited from scale. Standard Oil's size enabled Rockefeller to negotiate railroad rebates, acquire early tank cars, etc. He could profitably sell oil at a price point lower than his competitors could produce it. 2/14
Economies of scale aren't as powerful in software because the underlying infrastructure components - internet, servers, etc. - are generally shared resources. The cost of compute isn't that much different for BigCo and SmallCo. And scale is available instantly. 3/14
Interesting thought by @sama on AI-enabled Moore's law of everything: "Imagine a world where, for decades, everything–housing, education, food, clothing, etc.–became half as expensive every two years." 1/11
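To get a feel for how fast "half as expensive every two years" compounds, here is a quick back-of-the-envelope sketch. The function name `price_after` and the $100 starting price are made up for illustration; the only input taken from the quote is the two-year halving period.

```python
def price_after(initial_price: float, years: float, halving_period: float = 2.0) -> float:
    """Price after `years` if it halves once every `halving_period` years.

    Hypothetical helper for illustration: assumes a smooth, constant
    halving rate, which is a simplification of the quoted scenario.
    """
    return initial_price * 0.5 ** (years / halving_period)


# Illustrative starting price of $100 (an assumption, not from the thread).
for years in (2, 10, 20):
    print(f"after {years:>2} years: ${price_after(100.0, years):.4f}")
```

Under these assumptions, two decades means ten halvings, so something that costs $100 today would cost about a tenth of a dollar.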
Labor is expensive, and AI promises to reduce the cost of labor to something nearing zero. The result is lower prices and higher profit margins. 2/11
Thanks to the reduction in labor costs, coupled with net-new value creation, AI will likely be the most significant source of wealth creation in human history. Our research suggests that AI could add $30T to equity market capitalization over the next 15-20 years. 3/11