Latest Twitter Threads by @varun_mathur on Thread Reader App

Oct 19, 2023 • 11 tweets • 13 min read

🌱 Wikipedia-style Community LLMs

tl;dr: a simple experiment showcasing why millions of Wikipedia-style community LLMs will one day provide a better user experience than the biggest closed AI startups for most use cases. It is hopeless to compete against "us".

We hear a lot about new and powerful LLMs every day, but what about the user experience? Where do you, as a user, really get the best results with the least amount of effort?

It turns out, the answer is in collaboration. These word calculators are a lot more useful when we use them together with each other. Think Wikipedia. To test this intuition, we built a tool called Collama and created a repo for Ethereum protocol research. In that repo, we added multiple people, who together added over 1400 of the most credible sources about Ethereum: Vitalik's essays, relevant whitepapers, highly credible Twitter threads, articles written by top crypto funds and research organizations, relevant code and specs, amongst many others.

And then, we put it to the test, hooking this repo up to a model. You can try it out here:

We then compared it to both Perplexity and ChatGPT on 10 questions. For every single question, the collaborative approach, with its carefully curated data, produced higher-quality and more credible responses. Every single time.

It turns out that People + LLMs = 🔥

What just happened here..
It took just over one day to curate the Collama repo here with the credible sources, and a few weeks to build the product before that. We had no business getting better results here! We have no GPUs, no AI experts..

The results, in the following thread, showcase that ChatGPT comes across as demoware in these examples, while Perplexity, given its use of the top few results returned by Google, gives more useful answers. However, an Ethereum-focused repo on Collama, with its focus on curated, highly credible sources, is where you would actually want to do your Ethereum protocol research.

As it turns out, the just-right consumer experience has to get everything right: from the base LLM, to how data is curated, to how it is prompted, to how free and limitless it is to use.

Building an end-to-end consumer product, not just software
The world of portalization of LLMs, where the one all-knowing base model is used without context, doesn't actually enable the best UX for all use cases. Beyond the generic usage, there is going to be another world of AI usage: a fragmented world of millions of domain-specific community LLMs, built by finetuning powerful open base models on community-gathered vectors, prompts and responses.

The data needs to be continuously nurtured like a plant, with the right automated tools being made available.

From a user perspective, the LLM is not the product, but the total experience using it is, which is a function of how it is prompted, how fast it is, and how cheap and limitless it is. Before Wikipedia originated, the predecessor company behind it tried to create all the high quality pages itself (NuPedia). That didn't work out. The seemingly messy system of reputation and collaboration which developed over a period of time enabled Wikipedia to grow and thrive. It is the most remarkable example of human social co-ordination even in adverse conditions, and we are going to need and see something similar for how LLMs get used at-scale. People will not settle for a slightly worse-off user experience, and they will also not settle for their hard work curating data being locked away by any one company. Remember - it is a continuous service. There are no moats, except the service which users would trust with their effort in curating their data.

"It is hopeless to compete against us", where the us is the rest of us, chipping away at improving our community LLMs from our work places, college campuses and coffee shops..

Thinking beyond Google
Google search results as information sources for a model are not good enough. As this experiment shows, there are more credible sources of information which Google is missing, and years of SEO practices mean the most credible chunks of information don't always surface in the top few results. For a search engine, top results that are generally good enough will do; but for knowledge synthesis, you need the absolute best vectors to earn the user's trust.

For example, in a question about EigenLayer, while ChatGPT does not even know what it is, Perplexity doesn't even reference the whitepaper or its founder's detailed thread, because Google doesn't return them in the top few results. This is why no one can build the best LLM experience based on just Google search results as vectors. We need the next order of innovation, and we need to view the web in a fundamentally different way. AI is a higher-order abstraction, and we are in an era of transition from the human-readable web to the machine web. The fundamental product we are working on is called VectorRank™️ to augment Wikipedia-style data curation, which can benefit all LLMs, everywhere.

What's next
Think Wikipedia-style community LLMs powered by a BitTorrent-like consumer network..

The absolute best user experience happens only with an extremely open and decentralized approach to AI.

Here we demoed how, at least for one community, the open approach provides a better AI experience than the top pre-existing products. In the days ahead, we will open up our experimental peer-to-peer inference network, which can run Mistral7B and other models on your laptops and desktops, in combination with this web interface, so AI usage increasingly becomes cheaper as well. Imagine being able to provide model responses to others on your college campus and earning community reputation for it.

The day is not far - you are not going to need a $20/month subscription for a professional AI service. That's not how Google achieved its scale, and that's not how the benefits of AI will get delivered billions of times a day. In fact, even before Google there existed a paid search engine called Hoover. Extending the idea of the product with the just-right horizontally scaled infrastructure and the just-right business model made search abundant at the time.

Our mission at Hyperspace is simple: to make AI abundant. What is the best experience, and how can it be free and limitless, billions of times a day?

PS: Collama is a simple RAG implementation inspired by many existing UIs, where each repo will eventually run against its own finetuned model with a Wikipedia-like reputation system. It's a garbage-in-garbage-out system: if a repo doesn't have sufficient data, then the default is simply using the top Google search results (which we will add as well, or I recommend using Perplexity instead). Today you can create your own collamas and add prompt-threads which others can see, favorite and fork. If you care about a topic - grow the community around it! We will not lock the data you have carefully curated over time behind our own service only - instead, we are taking an open protocol approach, where you will be able to participate in hosting your community LLM's data if you want. These are very early days, and we have just over 100,000 "passage" vectors in the system and close to 1,000 prompts which have been run, but we intend to scale to serving billions of highly ranked vectors to millions of community LLMs running on your own devices.

collama.ai/varun/ethereum
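To make the "simple RAG with a search fallback" idea above concrete, here is a minimal sketch. It is not Collama's actual code: the function names (`embed`, `retrieve`, `build_prompt`), the toy bag-of-words scoring, the `min_score` threshold and the `web_search` callback are all assumptions made for illustration. A real system would presumably use learned embeddings and a vector store.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'vector' (assumption: stands in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=3, min_score=0.2):
    """Return up to k curated passages that score at least min_score."""
    q = embed(question)
    scored = sorted(((cosine(q, embed(p)), p) for p in passages), reverse=True)
    return [p for score, p in scored[:k] if score >= min_score]


def build_prompt(question, passages, web_search):
    """Prefer the curated repo; fall back to web results if nothing relevant is found.

    web_search is a hypothetical callable: question -> list of text snippets.
    """
    context = retrieve(question, passages)
    source = "repo"
    if not context:
        context = web_search(question)
        source = "web"
    joined = "\n".join(f"- {c}" for c in context)
    prompt = f"Answer using only these sources:\n{joined}\n\nQuestion: {question}"
    return source, prompt
```

The returned prompt would then be sent to whichever model the repo is hooked up to. The garbage-in-garbage-out point shows up directly: the answer can only be as good as the passages that `retrieve` has to choose from.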

"What is EigenLayer and what are it's top 5 use cases per it's founder ?"

ChatGPT has no idea what EigenLayer is. Perplexity gives a good answer, while Collama gives the best answer as it specifically uses the thread about EigenLayer use cases by its founder.

Jan 29, 2022 • 9 tweets • 2 min read

Far out predictions are generally extremely ambitious while immediate-future predictions are deeply skeptical.

How about stuff in the near future? What could be the Black Swan events of the 6-to-18-month future?

BS#1: Twitter has a < $30B market cap. It's the prime source of "alpha", thus a major financial information player like Bloomberg or some hedge fund group could attempt a reasonable bid? But what if Amazon/Google/Microsoft go for it instead? How does that change our world?

Jan 28, 2022 • 7 tweets • 2 min read

What comes after Gmail?

As a sender, I need and want to have full control over what I “send”. Compose text, and can start/update/stop “streaming” email. Add people as viewers even after the message goes live.

As a receiver, I want to do zero additional things to see an email. As a receiver, I want to be able to see if I have seen the latest version and what is the diff if any.

As a receiver, I can fork that email stream and add my own set of recipients, if the original sender permissions allow, with a permission message sent to the original sender.

Jan 28, 2022 • 10 tweets • 3 min read

Thinking about decentralization across three levels:

1. Philosophy

2. Technical

3. UX

1. Philosophy of decentralization

Peer-to-peer is inherently the most efficient and the most fair state of a system. It’s empowering and reduces cost (of all kinds) by reducing inefficiencies.

Assume all systems get there - it’s a question of when and how, not if.

https://twitter.com/cdixon/status/965268012621217792

Jan 26, 2022 • 6 tweets • 2 min read

Utility blockchains are state machines, not datastores.

As chains get bigger, older transactions/proofs (eg >6month-1yr) will end up far away from a blockchain and on to a ‘boring’ distributed database.

Eg your NFT txn in 2020 will not be on either Solana or Ethereum by 2023.

So this is like verifying the purity of a water bottle, and then putting a sticker on it. After a while you move it to permanent storage.

A year later you return to check how things are - at this point you now need to trust how that database serving the info is maintained.

Jan 13, 2022 • 90 tweets • 32 min read

Introducing Hyperspace.gl

What if there existed a new type of app store, governed by developers and users, which integrated a fast, free and simple enough blockchain ideal for daily city life?

Why we are building the 'MacOS of the Web3' era:

👋 Before we get started, a quick bio:

🎬 Jared Leto's team picked up my '2025 thread'
🧘‍♂️ Satoshi - the sun will shine..
🌎 Hyperspace saw usage in 619 cities last year

This is my 2nd web3 venture, 2nd attempt at a 'WebOS': varunmathur.xyz

It will work this time🤞

Aug 29, 2021 • 12 tweets • 2 min read

The value of that NFT rock is not $1.3 million, but neither is it $0.

The simple news that a boring JPEG can have a non-zero value, that triggers the cycle and thinking for a lot more objects.

This leads to: NFT as an asset class will be worth over $1 trillion this decade.

Art has value because it is not inherently replicable. It is proof of work.

For the replicable image, the proof of work is only in what it cost to include it in the blockchain.

It’s not the rock image which has value, but the token.

Mar 21, 2021 • 17 tweets • 3 min read

The Art of War

It is the “un-lean startup” book. Translating it for the startup world.

Thread..

1. “Laying plans”

You have to study the competitive landscape closely - can’t just rush into it.

Be able to forecast victory or defeat.

Have plans, but be adaptive.

Mar 20, 2021 • 8 tweets • 2 min read

Some recent threads I wrote..

https://twitter.com/varun_mathur/status/1373308489842458625

https://twitter.com/varun_mathur/status/1373363269138780162

Mar 20, 2021 • 9 tweets • 2 min read

Most preposterous but plausible consumer tech alternate timeline:

Twitter built out a deep and rich public social graph. It became our primary “inbox”, replaced both email and LinkedIn for most interactions.

And then made that graph available to startups with revenue sharing. Zoom re-invented itself as a consumer company and built out an even simpler and delightful interface.

That required making tough changes to a successful product, but they are now the default API of how the world interacts.

Mar 20, 2021 • 23 tweets • 5 min read

A game theory approach for how to choose and work at startups. There are two key things to do here:

1. Choosing the right startup to work for

2. When you do, to give it your absolute best

Thread..

A startup is an extremely long job interview until the point stock has vested, is worth something and gives a financial return.

If you get fired or leave before that point, then all you have earned is a salary, which means you didn't choose properly or didn't work smart enough.

Mar 18, 2021 • 12 tweets • 3 min read

“Embarrassing startups”

It's 1994. You buy your books from Books.com. They have 400k titles.

Your friend is into a company called Cadabra, where their team plans to learn the book business and then make a book selling site.

You feel sorry for your friend.

(Amazon)

It’s the Labor Day weekend of 1995.

Everybody is working on a cool new e-commerce website. Your friend spent his evenings and weekends coding an auction website which nobody wanted. It has 0 users on day 1.

You feel sorry for your friend.

(eBay)

Feb 22, 2021 • 27 tweets • 12 min read

Recently I stumbled across a curious tweet from 2009, which upon further analysis, convinced me that it was from an anonymous Satoshi Nakamoto account.

This is Satoshi -> @fafcffacfff

Pre-Bitcoin announcement era. Thread (also, non-paywalled post)👇

offthetrack.substack.com/p/satoshis-ano…

Satoshi was an extremely chatty person, until the point "he" decided to go quiet in Dec 2010.

This tweet was written *one day before* the publication of the Bitcoin whitepaper on Oct 31st, 2008 and contained language similar to that used in later emails:

https://twitter.com/fafcffacfff/status/982620972

Feb 20, 2021 • 5 tweets • 1 min read

You build a great new product X.

Big tech then copies it feature by feature. What do you do?

-> Help improve big tech’s copied product experience by providing some service Y.

Why? Thread..

If you invent habit X and bring it to 10 people, and big tech then copies it and brings it to 10 million people, now those 10 million have developed that new habit X, and will eventually seek the best product in its category.

May 2, 2020 • 70 tweets • 15 min read

I am a time-traveller from 2025. I lived through the events of 2020 which ruptured the fabric of society as we knew it then. As humans, we adapted, and survived.

Here is what my world looks like now. This is my new normal.

A thread..

Firstly, we still don't have a COVID-19 vaccine available to billions of us worldwide. Key people in society and government were the first ones to get it, and the rest of us are waiting.

So while we wait, society has adapted to prevent COVID-19 from spreading rapidly ever again.

Feb 22, 2020 • 7 tweets • 3 min read

It's the summer of '99.

Google is growing rapidly but still has less than 1% share of the daily search market, is not focused on any niche, and is also not getting mentioned in the press. It is growing based on word of mouth alone and raises $25MM.

Pets.com raises $50MM🤷‍♂️

Google originated as "BackRub" at Stanford and in the fall of '96 was generating buzz in the academic research community.

By fall of '98, it had 10k queries/day, and raised $1MM in funding from Bezos and 3 other angel investors.

Salon wrote this then:
salon.com/test/1998/12/2…

Jan 3, 2020 • 24 tweets • 9 min read

What is the story of Twitter? How do you build breakthrough, mission-driven consumer products? What is the consistent pattern from Twitter to Airbnb?

=> Sociology-driven thesis, tapping into a deep human connective tissue + repeated attempts to keep spinning the flywheel

Jack, who had dropped out of NYU and was living in a tough neighborhood in Oakland (SF Bay Area), came up with the core idea in 2000. He was a frequent user of LiveJournal, observed people using it to share their "status", and thought what if that existed as a standalone service.

Dec 22, 2019 • 16 tweets • 9 min read

It’s the summer of 2010. Many people in the SF Bay Area (+ elsewhere) are reaching the conclusion that photo-based apps would be a very big deal in this decade, and they were right.

However, why did Instagram win, over much better funded rivals Color and PicPlz? A $100B story..

Let's back up to March 2010 first. Kevin Systrom had raised a $500k seed for an app called Burbn, from a16z and Baseline Ventures. It was supposed to be a location-based startup but wasn't quite adding up for Kevin.

He started developing Instagram instead..
instagram.com/p/G/

Sep 2, 2019 • 8 tweets • 4 min read

6 months before UberCab launched, 2009 saw the launch of Cabulous. It had icons of moving cabs on a map and also the ability to e-hail a cab.

It didn’t close the loop on the UX though (no payments), raised too little (<$1MM), had a wide city-wide focus and declined Bill Gurley!

Here is Gurley suggesting a rebrand as “Limo Magic” to go into black cars + his $8 million investment offer, one year before UberCab launched with limos/black cars.

Amazing to see multiple people converging on a similar solution to a problem.

Source: “Upstarts”

cc @IlyaAbyzov

Jul 9, 2019 • 15 tweets • 6 min read

/make hay

Early adopter markets are now so big and fluid that products can grow very fast in them, ventures get massive valuations and successful financial exits happen, all without actually having to cross the chasm, which was the failure point of startups in earlier years

The number of monthly paying subscribers to either Spotify or Netflix (~100MM) is higher than the total number of Internet users during the dot-com boom 20 years ago!

3MM-5MM are hosts on Airbnb, or driver-partners for Uber, Lyft or DoorDash. Slack has 10MM DAUs

World of 4B+ users

Apr 10, 2019 • 12 tweets • 3 min read

Upcoming tech IPOs have one thing in common: none of these originated from “lean startup”.

They were deliberate efforts, toiling away in obscurity for several years and then launching with well-formed products focused on design and user experience.

LYFT, PINT, UBER, SLAK, ABNB

Summarized my key thoughts from this thread in this link along with the book recommendations and other references: medium.com/@startuphacker…

Thanks all for the feedback.
