1/15: I visited a physical bookstore yesterday for the first time in a while, and had a revelation about knowledge distribution.
2/15: Knowledge on the internet is like a free market. Platforms serve you what they think you want (e.g. Netflix, YouTube, Instagram).
3/15: Knowledge before the internet was a highly regulated market: you needed authority figures (publishers) to vouch for the quality of your work before it could reach the shelves. Still a market, though, because more popular books got more shelf space.
4/15: A completely free market has the obvious advantage that there are fewer gaps in the market - content is fresher, and appeals to very niche demographics...
But at the obvious cost of quality. Content that is vile, inappropriate, hateful, etc. does exist.
5/15: Trying to establish partial regulation online is tricky because (1) regulation is in many ways not automatic enough to scale (beyond porn filters, etc.), and (2) users can see moderation decisions, so they complain when they disagree with them (YT takedowns, etc.).
6/15: But even today, people are more likely to trust a book than something they read online. With deregulation, you lose that trust (fake news).
7/15: Doing regulation at scale is a tricky problem from a machine learning perspective. A high-precision (high-trust) solution impinges on free speech (low recall), creating a quality vs. free speech tradeoff.
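To make the precision/recall tension concrete, here's a toy sketch with entirely synthetic scores and labels (not any real moderation system): moving a flagging threshold trades trust in takedowns against missed bad content.

```python
# Toy moderation classifier with a score threshold (all data synthetic).
# A high threshold flags little, so flags are trustworthy (high precision)
# but lots of bad content slips through (low recall) -- and vice versa.

def precision_recall(scores, labels, threshold):
    """Precision/recall of flagging items with score >= threshold.
    labels: 1 = genuinely bad content, 0 = benign."""
    flagged = [(s, y) for s, y in zip(scores, labels) if s >= threshold]
    tp = sum(y for _, y in flagged)                       # bad and flagged
    fn = sum(y for s, y in zip(scores, labels) if s < threshold)  # bad, missed
    precision = tp / len(flagged) if flagged else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall

# Hypothetical classifier scores (higher = more likely policy-violating).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

strict = precision_recall(scores, labels, threshold=0.85)  # trust over coverage
loose  = precision_recall(scores, labels, threshold=0.25)  # coverage over trust
print(strict, loose)  # -> (1.0, 0.5) (0.571..., 1.0)
```

The strict threshold never flags benign content but misses half the bad content; the loose one catches everything bad but takes down legitimate posts along the way.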
8/15: Another key aspect is what content gets exposure (ranking), not just what content is allowed (regulation). Products often optimize for engagement rather than expert judgment. An algorithm might put a listicle on the front page; an editor wouldn't.
9/15: A news editor cares about their expertise and their responsibility to the reader, regardless of what the user wants. They supply what the reader *needs*, not what they *want*.
10/15: However, in capitalism, that's fighting a losing battle. Capitalism rewards wants, not needs. More views is (usually) more money, not societal impact. How do we contend with that?
11/15: I think there are ways to tackle this problem with technology. Can we train on data only from an exclusive subset of the population whose opinions we trust?
Can we move to a subscription model to not have to rely on viewership numbers?
12/15: Non-tech-heavy solutions that work already exist: Reddit and Wikipedia both have trusted "moderators" who regulate content quality cheaply and at scale. Maybe we can empower those moderators to act as editors and decide which posts/pages get recommended.
13/15: Another unintuitive idea: cap the amount of content like a newspaper. If users know they only get access to a select few high quality things, maybe they'll value quality over quantity, and retention will be higher (?)
14/15: The physical bookstore visit reminded me that there's so much fascinating content I'd never look up organically. But because of the way bookstores are organized, I got to wander from old South Indian folklore to graphic novels about capitalism.
15/15: That made me think: I very rarely discover such niche, timeless, high-quality content online, and maybe there are opportunities to change that!
We just dropped a 12-page AI report on how ~500 execs at US enterprises use generative AI.
I read it all so you don't have to. Top 8 takeaways:
Anthropic is the #1 model provider in the enterprise, with 40% of ~$37B spend, with OpenAI dropping to #2.
1/8
On overall AI spend.
Generative AI has captured ~6% of software spend at $37B, growing ~3.2x YoY. Investments are coming to fruition and buyers are seeing results.
2/8
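Taking the quoted figures at face value, a quick back-of-envelope check of what they imply (approximate, derived only from the numbers above):

```python
# Back-of-envelope implied figures from the report's headline numbers.
gen_ai_spend_b = 37.0        # ~$37B generative AI spend
share_of_software = 0.06     # ~6% of total software spend
yoy_growth_multiple = 3.2    # ~3.2x year over year

# Implied total software spend and implied spend one year ago.
implied_total_software_b = gen_ai_spend_b / share_of_software   # ~$617B
implied_last_year_b = gen_ai_spend_b / yoy_growth_multiple      # ~$11.6B

print(round(implied_total_software_b), round(implied_last_year_b, 1))
# -> 617 11.6
```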
On where the spend goes.
Companies are using off-the-shelf models more than they're training their own. Horizontal AI tools like ChatGPT Enterprise, Claude for Work, Microsoft Copilot, and Glean have exploded. "Departmental" AI like Cursor and GitHub Copilot is also seeing a huge boost.
This new DeepMind research shows just how broken vector search is.
Turns out some docs in your index are theoretically incapable of being retrieved by vector search once the embedding dimensionality is fixed.
Plain old BM25 from 1994 outperforms it on recall.
1/4
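A toy illustration of the geometric point (my own simplified example, not the paper's LIMIT construction): under inner-product ranking, a document whose embedding is the midpoint of two others can never strictly outscore both of them, for any query vector, so it can never be the top-1 result.

```python
import random

# doc_c's embedding is exactly the average of doc_a's and doc_b's, so for
# any query q: q.c = (q.a + q.b) / 2 <= max(q.a, q.b). Under inner-product
# retrieval, doc_c can never strictly win -- it is structurally unretrievable
# as a top-1 hit, no matter the query.

doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]
doc_c = [0.5, 0.5]  # midpoint of doc_a and doc_b

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

random.seed(0)
for _ in range(10_000):
    q = [random.uniform(-1, 1), random.uniform(-1, 1)]
    sa, sb, sc = dot(q, doc_a), dot(q, doc_b), dot(q, doc_c)
    assert sc <= max(sa, sb) + 1e-12  # never strictly on top
```

This is the flavor of argument behind "theoretically incapable of being retrieved": in low dimensions, some combinations of documents simply cannot all be made rankable.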
As someone who's been a search nerd for more than a decade, this result gives me a lot of joy.
Haters will say that LIMIT, the dataset the authors created, is synthetic and unrealistic, but this matches my observations from building search systems at Google / Glean.
Vector search was popularized as an approachable drop-in search solution as OpenAI embeddings grew in popularity, but it has clear limitations in production settings.
Even aside from this result showing it consistently misses certain docs, it:
– doesn't search for concepts well
– often retrieves similar but unrelated results
– doesn't account for non-content signals of similarity (recency, popularity)
3/4
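For reference, the BM25 baseline mentioned above fits in a few lines. This is a minimal sketch of the classic Okapi BM25 scoring formula with toy documents, not a production implementation:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc (a list of tokens) against the query with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                      # document frequency per term
    for d in docs:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            f = tf[term]
            # Term frequency saturates (k1) and is normalized by doc length (b).
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat".split(),
    "dogs and cats living together".split(),
    "quarterly revenue report".split(),
]
print(bm25_scores("cat mat".split(), docs))  # only the first doc scores > 0
```

Note the exact-token matching: "cats" doesn't match "cat" here, which is the flip side of BM25's precision (real systems add stemming or lemmatization).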
I'm using GPT5 Pro to find me the best stocks and startup investments.
Asked it to use modern portfolio theory and size investments.
—Top Privates [+9.7%]: Databricks, Stripe, Anthropic, SpaceX
—Top Publics [+14.2%]: Nvidia, TSMC, Microsoft, Meta
Just put $1000 into the stocks!
Prompt: "Check all public / private stock market companies and tell me what I should invest in from first principles reasoning. You have $1000.
Please do deep research and present rationale for each investment. Each one should have a target price and expected value. Use advanced math for trading. Draw research from authoritative sources like research and unbiased pundits. Size my bets properly and use everything you know about portfolio theory. Corroborate each decision with a list of predictions about those companies.
Your goal is to maximize expected value. Make minimum 5 investments. Write it in a table."
This follows my previous experiment on Polymarket, which seemingly had ~2-4x the expected returns!
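For anyone curious what "size my bets with portfolio theory" can mean in its simplest form, here's a toy inverse-variance weighting sketch. The volatility numbers are made up for illustration, and this is not the allocation GPT5 Pro actually produced:

```python
# Toy position sizing: inverse-variance weighting, one of the simplest
# ideas from portfolio theory (lower-volatility assets get bigger weights).
# Volatility guesses below are HYPOTHETICAL, for illustration only.

budget = 1000.0
vol = {"NVDA": 0.45, "TSM": 0.35, "MSFT": 0.25, "META": 0.40}

inv_var = {t: 1.0 / v**2 for t, v in vol.items()}
total = sum(inv_var.values())
dollars = {t: budget * w / total for t, w in inv_var.items()}

# Allocations always sum back to the budget.
assert abs(sum(dollars.values()) - budget) < 1e-9
print({t: round(d, 2) for t, d in dollars.items()})
```

Real mean-variance optimization also uses expected returns and correlations; this ignores both and only penalizes volatility, which is why it's a sketch rather than a strategy.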
And yes, I know they've always reported on the 477 denominator, but that's NOT "SWE-Bench Verified"; it's an entirely different metric, "OpenAI's subset of SWE-Bench Verified", and that number can't be compared.
Microsoft just leaked their official compensation bands for engineers.
We often forget that you can be a stable high-performing engineer with great work-life balance, be a BigTech lifer, and comfortably retire with a net worth of ~$15M!