Emmanuel Ameisen
Mar 4 · 8 tweets · 3 min read
Claude 3 Opus is great at following multiple complex instructions.

To test it, @ErikSchluntz and I had it take on @karpathy's challenge to transform his 2h13m tokenizer video into a blog post, in ONE prompt, and it just... did it

Here are some details:
First, we grabbed the raw transcript of the video and screenshots taken at 5s intervals.

Then, we chunked the transcript into 24 parts for efficient processing (the whole transcript fits within the context window, so this is merely a speed optimization).
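The chunking step above can be sketched in a few lines. This is a minimal illustration, assuming chunks are contiguous and split on line boundaries; the actual repo may align chunk boundaries with screenshot timestamps instead.

```python
# Sketch: split a transcript into N roughly equal contiguous chunks so the
# parts can be processed in parallel. The chunk count (24) matches the
# thread; the splitting strategy here is an assumption.

def chunk_transcript(lines, n_chunks=24):
    """Split a list of transcript lines into n_chunks contiguous parts."""
    size, rem = divmod(len(lines), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # Spread the remainder over the first `rem` chunks.
        end = start + size + (1 if i < rem else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks

transcript = [f"line {i}" for i in range(1000)]
chunks = chunk_transcript(transcript)
```

Since the whole transcript fits in the context window, this buys only latency: each chunk can be sent as its own request and the results concatenated.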
We gave Opus the transcript and the video screenshots, as well as two *additional* screenshots:
- One of Andrej's blog, to show a visual style to follow
- One of the top of the notebook @karpathy shared, as a writing style example

On top of that, we added lots of instructions (prompt in the repo).
Here is a subset of what we asked the model, in one prompt (full prompt attached):
- directly write HTML
- filter out irrelevant screenshots
- transcribe the code examples in images, but only if they contain a complete example
- synthesize the transcript and image contents into prose
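A request like the one above can be assembled as a single multimodal message. The sketch below shows the general shape such a payload takes (interleaved text and base64 image blocks in one user message); the instruction text and screenshot bytes are placeholders, not the actual prompt from the repo.

```python
import base64

# Sketch: build one multimodal prompt containing instructions, screenshots,
# and a transcript chunk as interleaved content blocks. All text and image
# data here are illustrative placeholders.

INSTRUCTIONS = """Write the post directly as HTML.
Filter out irrelevant screenshots.
Transcribe code from images only when the example is complete.
Synthesize the transcript and images into prose."""

def build_prompt(transcript_chunk, screenshots_png):
    content = [{"type": "text", "text": INSTRUCTIONS}]
    for png_bytes in screenshots_png:
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": "image/png",
                "data": base64.b64encode(png_bytes).decode(),
            },
        })
    content.append({"type": "text", "text": transcript_chunk})
    return [{"role": "user", "content": content}]

messages = build_prompt("chunk 1 of the transcript...", [b"\x89PNG..."])
```

Keeping everything in one message is the point of the experiment: the model has to juggle formatting, filtering, transcription, and synthesis instructions simultaneously.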
@ErikSchluntz and I have read the resulting post, and Opus manages to incorporate all of these requests and produces a great blog post.

The blog post is formatted as asked, with a subset of images selected and captioned.
It writes code examples, and relates the content of the transcript to the screenshots to provide a coherent narrative.

Overall, the tutorial is readable, clear, and much better than anything I've previously gotten out of an LLM.
Of course, the model isn't perfect yet!

When looking through the post, @ErikSchluntz found some issues and inconsistencies.

Some minor code bugs slipped through, and some of the sections are repetitive (this is partially due to the chunks being processed in parallel).
This was done in one prompt that @zswitten @ErikSchluntz and I wrote.

If you'd like to try to improve it, here are the prompt and the full blog post:
github.com/hundredblocks/…
hundredblocks.github.io/transcription_…

More from @mlpowered

Jan 18, 2023
I just finished watching @karpathy's let's build GPT lecture, and I think it might be the best in the zero-to-hero series so far.

Here are eight insights about transformers that the video did a great job explaining.

Watch the video for more.



1. Transformers as sum of attention blocks

A transformer is mostly a stack of attention blocks. These work similarly in encoders and decoders (see difference below). Each attention block contains multiple heads, allowing each head to attend to different types of information.
2. Encoder vs decoder transformers

What's the difference between encoders and decoders in transformers?

Encoders use all the information in the input to produce their output.

Decoders use only information from older tokens to predict the next token.
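The encoder/decoder distinction above comes down to the attention mask. Here is a minimal numpy sketch (not from the lecture): an encoder lets every position attend everywhere, while a decoder masks out future positions so each token only sees tokens before it.

```python
import numpy as np

# Sketch: encoder vs decoder attention as a masking difference.
# A causal mask blocks attention to future (upper-triangular) positions.

def attention_weights(scores, causal=False):
    """Softmax over attention scores, optionally with a causal mask."""
    scores = scores.copy().astype(float)
    if causal:
        # Set scores for future positions to -inf so they get zero weight.
        scores[np.triu(np.ones_like(scores, dtype=bool), k=1)] = -np.inf
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))                       # uniform scores, 4 tokens
enc = attention_weights(scores)                 # encoder: full attention
dec = attention_weights(scores, causal=True)    # decoder: causal attention
```

With uniform scores, every encoder row spreads weight over all 4 tokens, while decoder row i spreads weight only over tokens 0..i.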
Oct 19, 2022
Most ML folks I know have @AnthropicAI's Toy Models of Superposition paper on their reading list, but too few have read it.

It is one of the most interesting interpretability papers I've read in a while, and it can benefit anyone using deep learning.

Here are my takeaways!
1/ Feature superposition

The simplest way to represent useful input features in a hidden layer is by using one neuron per feature.

But what happens if you have more useful features than neurons?

How do you compress your features to fit them using fewer neurons?
How do neurons represent multiple features?

Well, each neuron represents a direction in vector space

So N hidden neurons can represent N features independently in N orthogonal directions

But they can also represent additional features non-orthogonally!
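The non-orthogonal trick above can be made concrete with a tiny numpy sketch. The 120-degree layout of three features in two dimensions is an illustrative choice, not taken from the paper: it packs one extra feature in, at the cost of cross-talk when reading features back out.

```python
import numpy as np

# Sketch of superposition: 2 dimensions hold 2 features orthogonally, but a
# 3rd feature can be squeezed in non-orthogonally. Reading a feature back
# out then picks up interference from the others.

angles = np.deg2rad([0.0, 120.0, 240.0])                # 3 directions in 2D
W = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (3 features, 2 dims)

# W @ W.T: diagonal = how cleanly each feature reads out,
# off-diagonal = cross-talk between features.
overlap = W @ W.T
interference = overlap - np.eye(3)
```

Each feature still reads out with full strength (diagonal of 1), but every pair of features now interferes with overlap cos(120°) = -0.5, which is the trade-off the paper studies.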
Oct 2, 2018
Just finished Ethics and Data Science, by @mikeloukides, @hmason, @dpatil on @OReillyMedia. Would highly recommend it, as it covers a complex topic from a broad view and with applied tips. Many great takeaways; I've summarized a few below (amazon.com/Ethics-Data-Sc…)
Consent requires clarity: It is required to ask users for the ability to use their data, but these requests are often vague and thus lead to breaches of trust. If I agree to give my data to Facebook for security, can it use it for anything? (techcrunch.com/2018/09/27/yes…)
Ethics should be taught hand in hand, not as an add-on: We often have "technical" classes (building a database) and ethical classes, but one should not be taught without the other. No perfect score in database building if you have not considered data storage ethics!
