The more I've been using it (since Saturday, when I had an MVP), the more I realize that this kind of tool is probably my best shot at building a Memex, a system that knows about and lets me search through my entire landscape of knowledge — theatlantic.com/magazine/archi…
I've probably performed ~100 searches for various names, ideas, memories, blogs, and other random things in the last week, and the most interesting thing is how searching for one thing helps me stumble into some unexpected insight or memory from my past. Creative randomness.
Lastly, great search and recall is a centerpiece of the "incremental note-taking" concept I discussed last week — monocle.surge.sh/?q=incremental…
Monocle is a system that doesn't need me to take notes; it gathers knowledge by looking through my existing digital footprint.
I've spent a bunch of time on this over the weekend, so I'm probably going to take a small break, but hopefully in the coming weeks and months I'll add a few more data sources to my search index (rough indexing sketch below):
- Browser history, YouTube watch history
- Reading list from Pocket
- Email (maybe?)
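(For the curious: this isn't Monocle's actual code, just a minimal sketch of the indexing idea, assuming each data source can be loaded as plain text and dumped into one tiny in-memory full-text index. The document IDs and sample texts are placeholders.)

```python
# Hypothetical sketch: one small inverted index over documents pulled from
# several personal data sources. Not Monocle's real implementation.
import re
from collections import defaultdict

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class SearchIndex:
    def __init__(self):
        self.docs = {}                 # doc_id -> (source, text)
        self.index = defaultdict(set)  # token -> set of doc_ids

    def add(self, doc_id, source, text):
        self.docs[doc_id] = (source, text)
        for token in tokenize(text):
            self.index[token].add(doc_id)

    def search(self, query):
        # Rank documents by how many query tokens they match
        hits = defaultdict(int)
        for token in tokenize(query):
            for doc_id in self.index[token]:
                hits[doc_id] += 1
        ranked = sorted(hits, key=hits.get, reverse=True)
        return [self.docs[doc_id] for doc_id in ranked]

# Usage: each source (tweets, Pocket, browser history...) feeds in documents.
index = SearchIndex()
index.add("tweet-1", "twitter", "incremental note-taking and search")
index.add("pocket-1", "pocket", "an article about the Memex and hypertext")
print(index.search("memex"))
```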
Lastly, a question I'm definitely expecting is "can I run this on my own data?"
Uhh... probably not right now? The system is pretty custom-built for my setup. But if I like it, I might make a version that's open for other people to try ✌️
Hypothesis: information work is overwhelmingly bottlenecked on the availability of high-signal context more than on correct inference over that context. If right, this implies a higher ROI-per-flop for context building than for pure logical inference.
h/t @anandnk24
Also, virtually all of the valuable context is in the tail of the information distribution. h/t @paraga
@anandnk24 wow this blew up. i dont have a soundcloud but go check out @PatronusAI to make sure your AI systems are behaving correctly and go make some ai apps on @ValDotTown
had a chance last night to meet with some of the best minds in AI to discuss the most pressing challenge facing society today:
✨how to afford attending the ERAS TOUR ✨
after much discussion, we've arrived at a breakthrough, what we've termed the "Taylor Swift Scaling Laws" 👇
the Taylor Swift Scaling Laws (TS2L) take inspiration from Scaling Laws for transformer-based LLMs, and apply the same log-log regression methodology to model and understand components of Taylor's ticket prices.
dare I say, we may have found something equally impactful
some highlights from the paper, which is a joint (overnight) work between me, @jtvhk, Ashish, and Niki (+ GPT4, thanks Notion for the OpenAI credit)
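(If you want to reproduce the "methodology": a toy sketch of the log-log regression step, fitting a power law price ≈ a·x^b by regressing log(price) on log(x). The ticket-price numbers here are invented purely for illustration.)

```python
# Toy sketch of the TS2L "methodology": fit a power law y ≈ a * x**b
# by ordinary least squares in log-log space. The numbers are made up.
import numpy as np

# Hypothetical data: x = rows from the stage, y = resale ticket price ($)
x = np.array([1, 5, 10, 25, 50, 100])
y = np.array([9500, 3200, 1800, 900, 450, 250])

b, log_a = np.polyfit(np.log(x), np.log(y), 1)   # slope, intercept
a = np.exp(log_a)
print(f"price ≈ {a:.0f} * rows**{b:.2f}")        # exponent ≈ -0.8 for this fake data
```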
Full write-up hopefully coming soon, but I'm using cosmo-xl for text generation with my own prompt, retrieving from an in-memory vector DB with sentence-transformers embeddings, and using @sendbluedotco for iMessage.
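(Roughly what the retrieval half could look like, as a hedged sketch: embed stored messages with sentence-transformers, keep the vectors in memory, and pull nearest neighbors for an incoming query by cosine similarity. The model name and stored texts are placeholders, not what I actually use; generation with cosmo-xl and the @sendbluedotco iMessage hookup are left out.)

```python
# Sketch of an in-memory vector store with sentence-transformers embeddings.
# Model choice and stored texts are placeholders for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# "Database": a list of texts plus a matrix of their normalized embeddings
memories = [
    "We talked about the Eras Tour ticket prices last week.",
    "Reminder: write up the latent-space length control demo.",
    "Notes on building a personal search engine over my digital footprint.",
]
emb = model.encode(memories, normalize_embeddings=True)

def retrieve(query, k=2):
    # Cosine similarity reduces to a dot product on normalized vectors
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = emb @ q
    top = np.argsort(-scores)[:k]
    return [(memories[i], float(scores[i])) for i in top]

print(retrieve("what did we say about concert tickets?"))
# The retrieved snippets then get stuffed into the prompt for the generator.
```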
The fact that LLMs generate text is not the point. LLMs are cheap, infinitely scalable black boxes to soft human-like reasoning. That's the headline! The text I/O mode is just the API to this reasoning genie. It's a side effect of the training paradigm.
A vanishingly small slice of knowledge work has the shape of text-in-text-out (copywriting/Jasper). The real alpha is not in generating text, but in using this new capability and wrapping it into jobs that have other shapes.
Exploring the "length" dimension in the latent space of a language model ✨
By scrubbing up/down across the text, I'm moving this sentence up and down a direction in the embedding space corresponding to text length — producing summaries w/ precise length control (1/n)
Length is one of many attributes that I can control by traversing the latent space of this model — others include style, emotional tone, context...
Here's "adding positivity" 🌈
It's a continuous space, so attributes can all be mixed/dialed more precisely than by rote prompting
More to follow soon on how it works, but in brief:
- Built on a custom LM arch based on T5 checkpoints
- An "attribute direction" is found from unpaired examples of texts w/ and w/o that trait
- Simple vector math in latent space + decoding from the latent gets you this effect.
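(A hedged sketch of the latent-arithmetic step, not the actual model: assume an `encode` that maps a text to a single latent vector and a `decode` that maps a latent back to text; both are stand-ins here for the custom T5-based model. The attribute direction comes from unpaired positive/negative examples, as described above.)

```python
# Sketch of attribute-direction control in a latent space.
# `encode` / `decode` stand in for the custom model's encoder and decoder.
import numpy as np

def encode(text: str) -> np.ndarray:
    raise NotImplementedError("stand-in for the model's text -> latent encoder")

def decode(latent: np.ndarray) -> str:
    raise NotImplementedError("stand-in for the model's latent -> text decoder")

def attribute_direction(with_trait, without_trait) -> np.ndarray:
    # Unpaired examples: the direction is the difference of the two group means
    pos = np.mean([encode(t) for t in with_trait], axis=0)
    neg = np.mean([encode(t) for t in without_trait], axis=0)
    return pos - neg

def edit(text: str, direction: np.ndarray, strength: float) -> str:
    # Slide the sentence along the direction (negative strength = less of the
    # trait, e.g. shorter text for a "length" direction), then decode back.
    return decode(encode(text) + strength * direction)

# e.g. length_dir = attribute_direction(long_texts, short_texts)
#      summary    = edit(paragraph, length_dir, strength=-1.5)
```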
Good tools admit virtuosity — they have low floors and high ceilings, and are open to beginners but support mastery, so that experts can deftly close the gap between their taste and their craft.
Prompt engineering does not admit virtuosity. We need something better.
Tools like Logic, Photoshop, or even the venerable paintbrush can be *mastered*, so that there is no ceiling imposed by the tool on how good you can get at going from image in your mind -> output. Masters of these tools can wield them as extensions of themselves.
For this to work, the tool has to present a coherent set of abstractions, and predictable behavior about how composing them will change the user's output. Prompt engineering is not predictable, and there are no coherent abstractions. It's all just gut feelings and copy-paste.