Jeremy Howard
🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Caught the #WokeMindVirus and ... I LIKE It!
Mar 7 9 tweets 2 min read
Today, with @Tim_Dettmers, @huggingface, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵
answer.ai/posts/2024-03-… "With this capability we can take huge models to new heights locally, and gigantic, hundreds of billions of parameter models are now accessible by small labs", says legendary model builder @Teknium1
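As a back-of-envelope sketch of why this is plausible (my numbers, not the project's exact memory accounting), 4-bit quantization plus sharding the weights across two cards brings a 70b model within reach of consumer GPUs:

```python
# Back-of-envelope memory math for a 70b model on consumer gaming GPUs.
# Assumptions (illustrative only): weights quantized to 4 bits, weights
# sharded FSDP-style across two 24 GB cards, adapters/activations ignored.
params = 70e9
quantized_weights_gb = params * 0.5 / 1e9     # 4 bits = 0.5 bytes per param
per_gpu_gb = quantized_weights_gb / 2         # 2-way shard across GPUs

print(f"4-bit weights: {quantized_weights_gb:.0f} GB")   # → 35 GB
print(f"per GPU (2-way shard): {per_gpu_gb:.1f} GB")     # → 17.5 GB, under 24 GB
```

In full precision the same weights would need 140 GB+, which is why quantization and sharding together are the enabling combination here.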
Feb 7 5 tweets 2 min read
Currying and composition in a nutshell (with APL). (This is easier for primary school children to learn than many things they are taught. At least according to the primary school kids I've taught it to.)
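The same two ideas translate directly into Python, if less tersely than APL (a rough sketch):

```python
# Currying: a two-argument function becomes a chain of one-argument functions.
def curry(f):
    return lambda x: lambda y: f(x, y)

add = lambda x, y: x + y
add3 = curry(add)(3)          # partially applied: still waiting for the second argument

# Composition: (f . g)(x) = f(g(x))
def compose(f, g):
    return lambda x: f(g(x))

double = lambda x: x * 2
add3_then_double = compose(double, add3)
print(add3_then_double(4))    # → 14  (4+3=7, then 7*2=14)
```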
Jan 26 14 tweets 6 min read
There are few things more important to our civilization than understanding how to better do R&D. Thankfully, @eric_is_weird has dedicated himself to studying this question.

As a result, he's become the foremost scholar and historian of 19th and 20th century R&D labs.
1/🧵 We are incredibly lucky that @eric_is_weird has taken a strong interest in Answer.AI, and decided to do a deep dive into our organizational structure and R&D approach.

His article is a fascinating exploration of the last 2 centuries of R&D:
Answer.AI
answer.ai/posts/2024-01-…
Dec 11, 2023 6 tweets 2 min read
This is rather long, and I haven't checked it, but in short the claims here are that complex GPT4 prompting achieves:

- 100% on ConceptARC (which is a really difficult task that previously hasn't been cracked)
- A chess engine that beats all other chess engines. Examples are provided (but they need to be run directly on GPT4 API with temperature 0) so you can check the claims.
Nov 18, 2023 11 tweets 3 min read
OK everyone's asking me for my take on the OpenAI stuff, so here it is. I have a strong feeling about what's going on, but no internal info so this is just me talking.

The first point to make is that the Dev Day was (IMO) an absolute embarrassment. I could barely watch the keynote. It was just another bland corp-speak bunch of product updates.

For the researchers I know who were involved from the beginning, this must have felt nausea-inducing.

The plan was AGI, lifting society to a new level. We got Laundry Buddy.
Oct 13, 2023 9 tweets 4 min read
If you're like me and find it easier to read *code* than *math*, and you have access to @OpenAI GPT 4V (or use @bing or @google Bard), try pasting an image of an equation you wanna understand in there.

It might just blow your mind.
1/🧵 Multiple equations? No problem!

Sep 24, 2023 5 tweets 1 min read
I wanted ChatGPT to show how to get likes/views ratio for a bunch of YouTube videos, without dealing with the hassle of YouTube's Data API limits.

But it didn't want to, because it claimed screen scraping is against the YouTube ToS.

So I lied to ChatGPT. It's weird how typing a lie into ChatGPT feels naughty, yet it's basically the same as typing a lie into Google Docs.

They're both just pieces of computer software.
Sep 24, 2023 11 tweets 4 min read
I just uploaded a 90 minute tutorial, which is designed to be the one place I point coders at when they ask "hey, tell me everything I need to know about LLMs!"

It starts at the basics: the 3-step pre-training / fine-tuning / classifier ULMFiT approach used in all modern LLMs. It goes all the way through to fine-tuning your own LLM that converts questions about data into SQL statements to answer the question, using @PyTorch, @huggingface Transformers, and @MetaAI Llama 2.
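For flavour, a question-to-SQL training example might be formatted something like this (the template and field names below are my own illustration, not the tutorial's exact format):

```python
# Hypothetical formatter turning (question, schema, sql) triples into
# training-example strings for question -> SQL fine-tuning.
def format_example(question: str, schema: str, sql: str) -> str:
    return (f"-- Schema:\n{schema}\n"
            f"-- Question: {question}\n"
            f"-- Answer:\n{sql}")

ex = format_example(
    question="How many rows are in the users table?",
    schema="CREATE TABLE users (id INT, name TEXT);",
    sql="SELECT COUNT(*) FROM users;",
)
print(ex)
```

During fine-tuning, each such string becomes one sequence for the language model to learn; at inference time you supply everything up to the answer marker and let the model complete the SQL.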
Sep 6, 2023 13 tweets 5 min read
It looks like @johnowhitaker & I may have found something crazy: LLMs can nearly perfectly memorise from just 1-2 examples!

We've written up a post explaining what we've seen, and why we think rapid memorization fits the pattern. Summary 🧵 follows.
fast.ai/posts/2023-09-… Johno & I are teaming up on the @Kaggle LLM Science Exam competition, which “challenges participants to answer difficult science-based questions written by a Large Language Model".

We were training models using a dataset compiled by @radekosmulski...
kaggle.com/competitions/k…
Sep 1, 2023 7 tweets 2 min read
There's an amazingly convenient way to install the *full* NVIDIA CUDA dev stack on Linux, that I've never seen mentioned before.

It's all done with conda!

I just tried it and it worked perfectly.🧵
docs.nvidia.com/cuda/cuda-inst… First you need conda installed (e.g. via anaconda, miniconda, or miniforge). If you don't have it already, just run this script:
github.com/fastai/fastset…
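From there, the toolkit install is roughly as follows (package and channel names below are my paraphrase and may vary by CUDA version; check the linked NVIDIA docs for the current incantation):

```shell
# Create a fresh environment and install the full CUDA dev stack from
# NVIDIA's conda channel (compiler, libraries, headers).
conda create -n cuda-dev python=3.10
conda activate cuda-dev
conda install -c nvidia cuda

# Verify the compiler is on PATH
nvcc --version
```

Because everything lives inside the conda environment, you can keep multiple CUDA versions side by side and avoid touching system packages.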
Aug 10, 2023 6 tweets 2 min read
Now that ChatGPT has rolled out custom instructions to most users, try out this instruction -- it makes GPT 4 far more accurate for me: (Concat the rest of this 🧵 together and put in your custom instruction section) You are an autoregressive language model that has been fine-tuned with instruction-tuning and RLHF. You carefully provide accurate, factual, thoughtful, nuanced answers, and are brilliant at reasoning. If you think there might not be a correct answer, you say so.
Jul 11, 2023 18 tweets 4 min read
I've spent the last few months interviewing >60 experts in law, economics, AI, alignment, etc, on the impacts of AI, and safety interventions.

Today I'm publishing my first article, showing regulation designed to increase AI safety may backfire badly!
fast.ai/posts/2023-11-… A new paper released today proposes various regulations designed to "ensure" safety of model *development*. The idea is to:
- Create standards for development and deployment of AI models, and
- Create mechanisms to ensure compliance with these standards.
arxiv.org/abs/2307.03718
May 31, 2023 11 tweets 3 min read
I teamed up with philosopher @sethlazar and AI impacts researcher @random_walker to investigate the "Statement on AI Risk" that proposes:

"Mitigating the risk of extinction from AI should be a global priority".

tl;dr: We're not convinced.🧵
fast.ai/posts/2023-05-… One thing I haven't seen mentioned elsewhere: the original request for people to sign the letter had the subject line "Invitation to join Hinton, Bengio & Amodei".

That's pretty powerful social status signaling being used to attract signatories.
May 4, 2023 11 tweets 5 min read
There's a new programming language in town - it's Mojo! I'm more than a little excited about it. It's Python, but with none of Python's problems.

You can write code as fast as C, and deploy small standalone applications like C.

My post is below, and a 🧵
fast.ai/posts/2023-05-… Python is the language that I have used for nearly all my work over the last few years. It is a beautiful language. It has an elegant core on which everything else is built.

But it comes with a downside: performance. It's thousands of times slower than C.
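A quick sketch of that overhead (timings vary by machine, and the exact ratio to C depends on the workload; the point is the cost of the interpreted loop itself):

```python
import time

# A tight interpreted loop: every iteration pays for bytecode dispatch
# and boxed-integer arithmetic. A C compiler turns the same loop into a
# few machine instructions (or folds it away entirely).
def py_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

start = time.perf_counter()
result = py_sum(10_000_000)
elapsed = time.perf_counter() - start
print(f"sum={result}, took {elapsed:.2f}s in pure Python")
```

This is exactly the kind of hot loop where Mojo's pitch -- Python-like syntax compiled down to C-like machine code -- is aimed.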
Apr 28, 2023 16 tweets 3 min read
I'm seeing a lot of people confused about this - asking: what exactly is the problem here? That's a great question!

Let's use this as a learning opportunity and dig in. 🧵 First, I've seen that one of the most common responses is that anyone criticising the original post clearly doesn't understand it and is ignorant of how language models work.

Aidan Gomez is an author of the Transformers paper, and is CEO of Cohere. I think he understands fine.
Apr 28, 2023 4 tweets 1 min read
Sometimes it feels like NLP papers prior to 2020 don't exist...

(Bidirectional autoregressive models have been common for many years, and were for instance used in ULMFiT.) AFAIK the first bidirectional RNN was from 1997. (Although it was popularised in Alex Graves's classic 2013 paper "Generating Sequences With Recurrent Neural Networks" I think.)
ieeexplore.ieee.org/document/650093
Apr 5, 2023 11 tweets 6 min read
Our new course, "From Deep Learning Foundations to Stable Diffusion", is finally done after 8 months of work!!!

With >30 hours of video content (all free, no ads!), you'll learn how to create and train a Stable Diffusion model starting from pure Python 🧵
fast.ai/posts/part2-20… This field was developing rapidly as we were developing and teaching the course, so many lessons include a walk-through of a paper that had just been released.

We also implement key papers that aren't in Stable Diffusion, such as Karras et al (2022)
arxiv.org/abs/2206.00364
Apr 3, 2023 25 tweets 7 min read
There's a lot of folks under the misunderstanding that it's now possible to run a 30B param LLM in <6GB, based on this GitHub discussion.

This is not the case. Understanding why gives us a chance to learn a lot of interesting stuff! 🧵
github.com/ggerganov/llam… The background is that the amazing @JustineTunney wrote this really cool commit for @ggerganov's llama.cpp, which modifies how llama models are loaded into memory to use mmap
github.com/ggerganov/llam…
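A minimal illustration of the mmap behaviour in question (pure Python standing in for llama.cpp's C++): mapping a file reserves address space, but pages are only read from disk as they're touched, which is why naive memory readings taken right after loading can look impossibly small:

```python
import mmap
import os
import tempfile

# Stand-in for a large model weights file.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 1_000_000)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The whole file is now addressable, but no bulk read has happened;
    # resident memory barely changes until pages are actually touched.
    first = mm[0]          # touching one byte faults in just that page
    print(len(mm), first)  # → 1000000 0
    mm.close()
```

So a 30B model loaded via mmap still needs its working set in RAM to run at speed -- the mapping just defers (and shares) the paging, it doesn't shrink the model.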
Nov 21, 2022 8 tweets 3 min read
Intriguing new study from the amazing Adriaan Bax and team suggests that most covid deaths resulted from (preventable) snoring droplets rather than (unpreventable) microaspiration. This could be a game changer.

No time for the paper? Then read this 🧵!
sciencedirect.com/science/articl… Infection of the lung with SARS-CoV-2 is a two-step process: first the nose / throat, then the lungs. The postulated, but physically implausible, mechanism for step 2 involves “microaspiration”.
Oct 24, 2022 7 tweets 4 min read
After just 2 weeks of the new @fastdotai course, our students are already making research advances in Stable Diffusion.

@sebderhy developed a novel yet simple modification to classifier-free guidance that gives better results (previous approach on left, new approach on right) Image @fastdotai @sebderhy I think in this case there's room to improve the results even further. The basic idea being tackled is that the "old way" of doing guidance actually increased the scale of the update (especially if the difference between conditional and unconditional embeddings is large)