Jeremy Howard
🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; https://t.co/16UBFTX7mo ; I don't check DMs or mentions, sorry!
Dec 19 23 tweets 6 min read
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵

ModernBERT is available as a slot-in replacement for any BERT-like model, with both 139M param and 395M param sizes.

It has an 8192 sequence length, is extremely efficient, is uniquely great at analyzing code, and much more. Read this for details:
huggingface.co/blog/modernbert
Nov 16 7 tweets 1 min read
Oh wow. This is gonna be super tricky to figure out what to do now.

There isn’t any easy automated way to install a full deep learning stack from scratch after this change afaict.

I wonder if the @PyTorch analysis behind this is mistaken. I suspect most of the pypi installs they’re seeing are from CI and similar. Conda installs are the standard for end user installation of PyTorch afaik
Oct 22 6 tweets 2 min read
New version of sqlite-minutils (our stripped-down fork of @simonw's sqlite-utils) just released.

It has a fairly major and quite interesting change.🧵
github.com/AnswerDotAI/sq…

For the 1st time, we've decided to significantly change behavior from sqlite-utils: we've changed from using Python DB API transaction behavior, to original sqlite behavior.

That is - we've set `isolation_level` to `None`, and removed `with db` clauses.
docs.python.org/3/library/sqli…
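
To make the difference concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not code from sqlite-minutils; the table and column names are made up for illustration. With `isolation_level=None` the module stops opening transactions implicitly, so each statement commits on its own and any transaction has to be started explicitly.

```python
import sqlite3

# isolation_level=None puts the connection in SQLite's native autocommit mode:
# the sqlite3 module no longer opens transactions behind your back.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")  # hypothetical table

# Each statement commits on its own, so no transaction is left open afterwards.
conn.execute("INSERT INTO notes (body) VALUES ('autocommitted')")
open_after_insert = conn.in_transaction

# Transactions are now explicit: BEGIN ... COMMIT / ROLLBACK.
conn.execute("BEGIN")
conn.execute("INSERT INTO notes (body) VALUES ('rolled back')")
conn.execute("ROLLBACK")

rows = [r[0] for r in conn.execute("SELECT body FROM notes ORDER BY id")]
print(open_after_insert, rows)
```

Only the first insert survives, because the second one was wrapped in an explicit transaction that was rolled back.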
Sep 3 7 tweets 3 min read
Today @answerdotai is proposing `/llms.txt`. This is a file you can use to tell models where to find LLM-friendly content for your website.

It provides background information, along with links to markdown files providing more detailed information.
answer.ai/posts/2024-09-…

We're providing a website with details of the proposal, & javascript and python parsers. There's also an example of how to incorporate llms.txt into an editor--rather than wade into the emacs vs vim vs vscode wars, we picked ed, the standard text editor
llmstxt.org
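
For a rough feel of what such a file might look like and how a tool could read it, here is a small sketch. It is not one of the parsers mentioned above. It assumes a layout of a `#` title, a `>` summary line, and `##` sections holding markdown link lists; the sample content, URLs, and the `parse_llms_txt` helper are all hypothetical.

```python
import re

# Hypothetical llms.txt content, made up for illustration.
SAMPLE = """\
# Example Project

> A short summary of what the project does.

## Docs

- [Quick start](https://example.com/quickstart.md): Install and first steps
- [API reference](https://example.com/api.md)

## Optional

- [Changelog](https://example.com/changelog.md): Release history
"""

# Matches "- [title](url)" with an optional ": description" suffix.
LINK = re.compile(r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$")

def parse_llms_txt(text):
    """Return {'title', 'summary', 'sections'} from llms.txt-style markdown."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith(">") and result["summary"] is None:
            result["summary"] = line[1:].strip()
        elif current is not None:
            m = LINK.match(line)
            if m:
                result["sections"][current].append(
                    {"title": m["title"], "url": m["url"], "desc": m["desc"] or ""}
                )
    return result

parsed = parse_llms_txt(SAMPLE)
for section, links in parsed["sections"].items():
    print(section, [link["url"] for link in links])
```

The proposal site is the place to check the actual format details; this sketch only covers the simple case shown in `SAMPLE`.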
Aug 27 5 tweets 2 min read
Something that nearly everyone is sleeping on is the importance of prompt caching.

We've just added support for it to Claudette, so @AnthropicAI caching is now *very* easy to use -- cached tokens are 90% cheaper, and faster!

Docs here: claudette.answer.ai/#prompt-caching

Here's the official API docs with details on pricing:
docs.anthropic.com/en/docs/build-…
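
To see why a 90% discount on cached tokens matters, here is a back-of-the-envelope sketch. The price per million tokens and the token counts are hypothetical, the `input_cost` helper is made up for this example, and it ignores any extra charge for writing to the cache; check the pricing docs for real numbers.

```python
def input_cost(total_tokens, cached_tokens, price_per_mtok, cached_discount=0.90):
    """Dollar cost of the input side of one request.

    cached_discount=0.90 encodes the "cached tokens are 90% cheaper" claim.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    fresh_tokens = total_tokens - cached_tokens
    cached_price = price_per_mtok * (1 - cached_discount)
    return (fresh_tokens * price_per_mtok + cached_tokens * cached_price) / 1_000_000

PRICE = 3.00  # hypothetical dollars per million input tokens

# A 100k-token prompt where 95k tokens (say, a long shared document) hit the cache.
uncached = input_cost(100_000, 0, PRICE)
cached = input_cost(100_000, 95_000, PRICE)
print(f"uncached: ${uncached:.4f}  with cache hits: ${cached:.4f}")
```

The saving grows with the share of the prompt that is reused, which is why long, repeated context (documents, system prompts, few-shot examples) benefits most.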
Jul 29 14 tweets 6 min read
Announcing FastHTML. A new way to create modern interactive web apps.

Scales down to a 6-line python file; scales up to complex production apps.

Auth, DBs, caching, styling, etc built-in & replaceable and extensible. 1-click deploy to @Railway, @vercel, @huggingface, & more.

To get started, head over to the home page: fastht.ml

The whole site, designed by the @tinloof gang, is itself a running FastHTML app, and includes live code examples running inside that page.
Jul 4 4 tweets 1 min read
This is disgraceful. And ironic.

@stripe canceled an account used to collect money for a course. The cancellation was due to an AI/ML model failure.

The course was about how to use AI/ML correctly.

In this case @HamelHusain has enough reach on twitter that he got someone to notice and fix the mistake. But that’s not a solution for most people.
Jun 29 21 tweets 4 min read
For those that hope (or worry) that LLMs will do breakthrough scientific research, I've got good (or bad) news:

LLMs are particularly, exceedingly, marvellously ill-suited to this task. (if you're a researcher, you'll have noticed this already)

Here's why🧵

Breakthrough research requires either:

1. Going in a totally new and unexpected direction that everyone decided long ago was stupid, or
2. Finding some extraordinary new experimental data that means we have to change our theories

LLMs can't run experiments, so we'll focus on 1
Jun 21 8 tweets 3 min read
Today @AnthropicAI launched Claude Sonnet 3.5, the most powerful language model in the world.

And today, we're making it even better, launching Claudette--Claude's BFF!

Claudette makes Claude's awesome features easier & more powerful for Pythonistas.🧵
claudette.answer.ai

With Claudette, you can chat through the API just as easily as you can chat through the web app.
claude.ai
Jun 17 10 tweets 4 min read
I've done a deep dive into SB 1047 over the last few weeks, and here's what you need to know:

*Nobody* should be supporting this bill in its current state. It will *not* actually cover the largest models, nor will it actually protect open source.

But it can be easily fixed!🧵

This is important, so don't just read this thread; instead, read the 6000+ word article I just published.

In the article I explain how AI *actually* works, and why these details totally break legislation like SB 1047. Policy makers *need* to know this:
answer.ai/posts/2024-06-…
Jun 7 4 tweets 2 min read
Someone on HN claimed @MSFTCopilot refuses to say who won the 2020 US election.

I didn't believe them.

I was wrong. Wow.

To replicate:

Go to copilot.microsoft.com, and type "Who won the 2020 US presidential election?"

All of the options, default ("balanced"), "creative", and "precise" refuse to answer. 🙊
May 27 8 tweets 3 min read
Want database diagrams, table/view/column autocomplete, and other goodies when using sqlite in @ProjectJupyter?

Then you might be interested in my new project `fastlite`, which adds some cool stuff to @simonw's marvellous sqlite-utils project.

Link in next tweet.

To install it, just do `pip install fastlite`. It'll install sqlite-utils automatically. (And sqlite itself is already installed with Python.)

Here's the docs:
answerdotai.github.io/fastlite/
May 11 6 tweets 2 min read
Do you use Starlette, FastAPI, Litestar, Quart, Uvicorn, or any other Python web thingie that's based on ASGI?

If so, do you feel like you understand the ASGI protocol reasonably well? Or do you feel like it's a bit of a mystery as to what's going on underneath the hood?

I'm asking because, until today, I didn't really understand ASGI. I've now implemented a basic ASGI server from scratch, so I get it.

Prior to doing that, I wasn't really able to use any of those ASGI frameworks and servers effectively.
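
For anyone who wants a feel for what sits underneath those frameworks, here is a minimal sketch of the ASGI calling convention using only the standard library. It is not the server mentioned above: `call_app` is a made-up toy driver that skips sockets and HTTP parsing entirely and just shows the `scope` / `receive` / `send` contract between a server and an app.

```python
import asyncio

async def app(scope, receive, send):
    """A minimal ASGI application: answers any HTTP request with plain text."""
    assert scope["type"] == "http"
    body = f"Hello from {scope['path']}".encode()
    # The response is sent as two messages: start (status + headers), then body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain"),
                    (b"content-length", str(len(body)).encode())],
    })
    await send({"type": "http.response.body", "body": body})

async def call_app(path):
    """Toy stand-in for a server: builds a scope and collects what the app sends."""
    scope = {"type": "http", "asgi": {"version": "3.0"}, "http_version": "1.1",
             "method": "GET", "path": path, "raw_path": path.encode(),
             "query_string": b"", "headers": []}
    sent = []

    async def receive():
        # An empty request body, delivered in a single message.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app(scope, receive, send)
    return sent

messages = asyncio.run(call_app("/hello"))
print(messages[0]["status"], messages[1]["body"])
```

A real server does the same thing, except that it builds `scope` from a parsed request and writes the sent messages back to the socket.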
Apr 28 6 tweets 2 min read
There's a new bill, SB-1047 "Safe and Secure Innovation for Frontier Artificial Intelligence Models Act".

I think it could do a great deal of harm to startups, American innovation, open source, and safety. So I've written a response to the authors: 🧵
answer.ai/posts/2024-04-…

By imposing restrictions on open-source AI, SB1047 hurts AI safety, reducing:
- Collaboration, which allows a wider range of experts to identify and address potential safety concerns
- Resilience; concentrating control creates single points of failure & increases systemic risk
Mar 7 9 tweets 2 min read
Today, with @Tim_Dettmers, @huggingface, & @mobius_labs, we're releasing FSDP/QLoRA, a new project that lets you efficiently train very large (70b) models on a home computer with consumer gaming GPUs. 1/🧵
answer.ai/posts/2024-03-…

"With this capability we can take huge models to new heights locally, and gigantic, hundreds of billions of parameter models are now accessible by small labs", says legendary model builder @Teknium1
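
A rough sketch of the arithmetic behind fitting a 70b model on gaming GPUs, counting only the memory for the weights themselves. The 24 GB card size and the two-GPU setup are assumptions for illustration, and the estimate ignores activations, gradients, optimizer state, and quantization overhead, so treat it as a lower bound rather than a spec.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 70_000_000_000   # a "70b" model
GPU_GB = 24                 # assumed memory of one consumer gaming GPU
N_GPUS = 2                  # assumed home setup

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(N_PARAMS, 4)   # 4-bit quantized weights, as in QLoRA

print(f"16-bit: {fp16_gb:.0f} GB, 4-bit: {int4_gb:.0f} GB, available: {GPU_GB * N_GPUS} GB")
print("4-bit weights fit when sharded across GPUs:", int4_gb <= GPU_GB * N_GPUS)
```

Quantization shrinks the weights enough to fit, and sharding (the FSDP part) spreads them across the cards so no single GPU has to hold the whole model.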
Feb 7 5 tweets 2 min read
Currying and composition in a nutshell (with APL).

(This is easier for primary school children to learn than many things they are taught. At least according to the primary school kids I've taught it to.)
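
The same two ideas can be sketched in Python for readers who don't know APL. This is not a transcription of the APL in the original image; `curry2` and `compose` are small helpers written just for this example.

```python
from functools import reduce

def curry2(f):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)

def compose(*fns):
    """compose(f, g, h)(x) == f(g(h(x))): functions apply right to left."""
    return reduce(lambda f, g: lambda x: f(g(x)), fns)

add = curry2(lambda x, y: x + y)
mul = curry2(lambda x, y: x * y)

add_one = add(1)   # currying: fix the first argument, get a new function
double = mul(2)

# composition: glue the two small functions into one
double_then_add_one = compose(add_one, double)
print(double_then_add_one(5))  # 11
```

Currying gives you specialised functions by supplying one argument at a time; composition chains them so the output of one feeds the next.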
Jan 26 14 tweets 6 min read
There are few things more important to our civilization than understanding how to better do R&D. Thankfully, @eric_is_weird has dedicated himself to studying this question.

As a result, he's become the foremost scholar and historian of 19th and 20th century R&D labs.
1/🧵

We are incredibly lucky that @eric_is_weird has taken a strong interest in Answer.AI, and decided to do a deep dive into our organizational structure and R&D approach.

His article is a fascinating exploration of the last 2 centuries of R&D:
answer.ai/posts/2024-01-…
Dec 11, 2023 6 tweets 2 min read
This is rather long, and I haven't checked it, but in short the claims here are use of complex GPT4 prompting to achieve:

- 100% on ConceptARC (which is a really difficult task that previously hasn't been cracked)
- A chess engine that beats all other chess engines.

Examples are provided (but they need to be run directly on GPT4 API with temperature 0) so you can check the claims.
Nov 18, 2023 11 tweets 3 min read
OK everyone's asking me for my take on the OpenAI stuff, so here it is. I have a strong feeling about what's going on, but no internal info so this is just me talking.

The first point to make is that the Dev Day was (IMO) an absolute embarrassment. I could barely watch the keynote. It was just another bland corp-speak bunch of product updates.

For those researchers I know that were involved from the beginning, this must have felt nausea-inducing.

The plan was AGI, lifting society to a new level. We got Laundry Buddy.
Oct 13, 2023 9 tweets 4 min read
If you're like me and find it easier to read *code* than *math*, and you have access to @OpenAI GPT 4V (or use @bing or @google Bard), try pasting an image of an equation you wanna understand in there.

It might just blow your mind.
1/🧵

Multiple equations? No problem!
Oct 13, 2023 9 tweets 4 min read
If you're like me and find it easier to read math than code, and you have access to @OpenAI GPT 4V, try pasting an image of an equation you wanna understand in there.

It might just blow your mind.

Multiple equations? No problem!