Latest Twitter Threads by @full_stack_dl on Thread Reader App

Jul 25, 2023 • 9 tweets • 3 min read

Is it the revenge of recurrent nets? Is it a subquadratic Transformer?

It's both, it's neither, it's RWKV: @BlinkDL_AI's novel architecture that infers efficiently like an RNN but matches Transformer quality -- so far.

Deep dive by @charles_irl:

fullstackdeeplearning.com/blog/posts/rwk…

What is RWKV?

Typical RNNs are like a for loop that can't be vectorized, which hurts parallelization during training.

RWKV cleverly resolves this with a layer that works like an RNN cell when it's run step by step, but can be computed all at once like Transformer attention.
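
To make the "step by step vs. all at once" idea concrete, here is a minimal sketch. It is not the actual RWKV layer: it uses a simplified, exponentially decayed weighted average, and the function names and the scalar `decay` parameter are assumptions made up for illustration. The point is only that the same outputs can come from a recurrent loop with fixed-size state or from per-position sums that don't depend on each other.

```python
import math

def decayed_average_recurrent(keys, values, decay):
    """RNN-style: walk the sequence once, carrying a fixed-size state (num, den)."""
    num, den = 0.0, 0.0
    outputs = []
    for k, v in zip(keys, values):
        weight = math.exp(k)
        num = decay * num + weight * v  # running weighted sum of values
        den = decay * den + weight      # running sum of weights
        outputs.append(num / den)
    return outputs

def decayed_average_parallel(keys, values, decay):
    """Attention-style: each position is computed directly from the inputs so far.

    No output depends on another output, so on real hardware all positions
    could be computed at once (here they are plain Python loops).
    """
    outputs = []
    for t in range(len(keys)):
        weights = [decay ** (t - i) * math.exp(keys[i]) for i in range(t + 1)]
        total = sum(w * values[i] for i, w in enumerate(weights))
        outputs.append(total / sum(weights))
    return outputs

# Hypothetical toy inputs, chosen only for illustration.
keys = [0.1, -0.3, 0.7, 0.2]
values = [1.0, 2.0, -1.0, 0.5]
recurrent = decayed_average_recurrent(keys, values, decay=0.9)
parallel = decayed_average_parallel(keys, values, decay=0.9)
```

The recurrent form is what you'd want at inference time (constant memory per step); the parallel form is what you'd want during training.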

May 25, 2023 • 5 tweets • 5 min read

🆕 LLM Bootcamp videos are now available!

Check out our awesome invited speakers:

🏋🏻 @truerezashabani walks us through training LLMs at @Replit
🕵🏽 @hwchase17 talks about building agents with @LangChainAI
🔥 @npew talks about the path to @OpenAI ChatGPT

@truerezashabani led the team that trained the new bespoke code completion models at @Replit.

He breaks down
· The Modern LLM Stack™️
· What makes a good "LLM engineer"
· The importance of knowing and cleaning your data

fullstackdeeplearning.com/llm-bootcamp/s…

May 23, 2023 • 16 tweets • 6 min read

🥞🦜 LLM Bootcamp 🦜🥞

Today, let's talk about UX.

tl;dr: LLMs unlock new user interaction design patterns based on language user interfaces (LUIs). But the same principles of user-centered design still apply!

Since the inception of computing, programmers & designers have dreamed of interfacing with computers via language as naturally as we interface with each other.

Proof-of-concepts for such language user interfaces date back to the 60s and recur repeatedly.

LLMs make LUIs possible.

May 16, 2023 • 13 tweets • 5 min read

🥞🦜 LLM Bootcamp 🦜🥞

Today, let's talk about prompt engineering.

tl;dr Effective prompting requires some intuition about language models, but there's an emerging playbook of general techniques.

First off: What is a "prompt"? What is "prompt engineering"?

The prompt is the text that goes into your language model.

Prompt engineering is the design of that text: how is it formatted, what information is in it, and what "magic words" are included.
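
As a rough illustration of those three design choices, here is a small sketch. The function name `build_prompt`, the template layout, and the example question are all hypothetical; the default "magic words" are just one commonly used phrase, not a recommendation from the thread.

```python
def build_prompt(question, context, magic_words="Let's think step by step."):
    """Assemble the text that goes into the language model.

    - formatting: labelled sections separated by blank lines
    - information: the context passage and the question
    - "magic words": a trailing phrase meant to nudge the model's behavior
    """
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        f"Answer: {magic_words}"
    )

# Hypothetical inputs, for illustration only.
prompt = build_prompt(
    question="What does FSDL stand for?",
    context="FSDL is short for Full Stack Deep Learning.",
)
print(prompt)
```

Everything the model sees is this one string, so changing any of the three pieces changes the prompt.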

Apr 17, 2023 • 14 tweets • 5 min read

🦜 LLM Lit Review 🦜

Over the last two weeks, we tweeted out twelve papers we love in the world of language modeling, from agent simulation and browser automation to BERTology and artificial cognitive science.

Here they are, collected in a single 🧵 for your convenience.

1/12 - Reynolds and McDonell, 2021. "Prompt Programming for LLMs: Beyond the Few-Shot Paradigm"

The OG Prompt Engineering paper -- formatting tricks, agent sim, and chain-of-thought, before they were cool

https://twitter.com/full_stack_dl/status/1640738021854310401

Feb 21, 2023 • 4 tweets • 2 min read

Whatever our thoughts on chat _bots_, we enjoyed our chat with @hwchase17 of @LangChainAI on the most recent FSDL Tool Talk!

@charles_irl started us off with an overview of why we need LLM frameworks, then after a demo of how to use LangChain to do Q&A over the LangChain docs we did some live Q&A -- humans only.

Nov 7, 2022 • 5 tweets • 2 min read

In the past 12 weeks of FSDL 2022, we've shared eight lab notebooks, nine lecture videos, and twenty-five student projects, collecting each into a 🧵thread-of-threads 🧵.

Let's collect all that up into a 🧵thread of threads-of-threads 🧵so we can pin it and you can find it!

1/ 📜 Lectures 🧵 of 🧵s

https://twitter.com/full_stack_dl/status/1587104642004942849

Nov 3, 2022 • 8 tweets • 4 min read

In past threads, we've seen that our students this year built webcam-based visual Q&A, semantic search engines, and more.

In this final thread, we'll see the last few projects, from an iNaturalist-style plant identifier to a streaming art generator.

https://twitter.com/full_stack_dl/status/1587467007405932545

In-Browser AI: neural networks in the browser with the @onnxai web runtime.

Team: @visheratin

Check it out here: edge-ai.vercel.app

Nov 3, 2022 • 8 tweets • 3 min read

We're sharing some of our favorite student projects from this year's cohort.

Read on for a sign language word detector, a semantic image blurring system, and more!

https://twitter.com/full_stack_dl/status/1587467007405932545

👻 Image Anonymiser: building on object detectors to selectively remove information from images.

Team: Sami Saadaoui, Vladislav Vancak, Lawrence Francis, Dan Howarth, Allan Stevenson

Really nice writeup here: saadaosa.github.io/ImageAnonymise…

Nov 2, 2022 • 8 tweets • 2 min read

We've tweeted a lot of threads on the FSDL 2022 labs, which build a simple demo OCR app from scratch.

It may not have been obvious from the outside, but these labs are designed to each be useful independently.

So we've collected them up into a 🧵 thread-of-threads 🧵.

(Pre)Labs 0-3 🧵: Overview, PyTorch, Lightning + CNNs, and Transformers

https://twitter.com/full_stack_dl/status/1554134881696772096

Nov 2, 2022 • 11 tweets • 7 min read

In our last thread, we made it through the first few student projects, including a video summarizer and an X-ray reader.

Now, let's do a few more -- like the nanofiber measurer and the recipe inverter.

https://twitter.com/full_stack_dl/status/1587467007405932545

Archaeological Feature Detector: automatically detect and interpret evidence for remains of prehistoric structures in LIDAR data.

Team: Philanoe, jmmoreu, kempbray, lakillo (github u/ns)

We love to see interdisciplinary work!

Nov 1, 2022 • 10 tweets • 4 min read

At FSDL, we believe in the power of shipping. So in the class, students build and share their own ML-powered products. This year, folks built all kinds of things, from a recipe inverter to a VTuber generator.

We'll be sharing some of our favorites here on Twitter.

Course Co-Pilot: convert a YT video into a text summary.

Team: @waydegilliam, @kurianbenoy2, @suvash

We look forward to using this to automate lecture note generation!

Oct 31, 2022 • 11 tweets • 5 min read

Over the last few months of running the Full Stack Deep Learning course, we released one lecture video (+notes) each week and wrote an accompanying Twitter thread.

That's a lot of content, so here's a 🧵 thread-of-threads 🧵 collecting all of them up.

Lecture 1 🧵: Course Vision and When to Use ML, by @josh_tobin_

https://twitter.com/full_stack_dl/status/1556807491601571840

Oct 26, 2022 • 8 tweets • 3 min read

🧪 Lab 5: Troubleshooting & Testing 🧪 didn't get a thread at time of release, so let's do one now!

For the testing part of the lab, we cover the basic tools with an eye on what works best for ML.

For example, Shellcheck to catch weird edge cases in the bash scripts that often glue ML pipeline steps together and fail silently with dire consequences:

shellcheck.net

Oct 17, 2022 • 6 tweets • 3 min read

We agree with @sh_reya, @rogarcia_sanz, et al. -- educational resources _have_ to keep up with the rapid changes in applied ML.

And we agree that it's an opportunity, not just a challenge!

If you're excited about this opportunity too, come join our community ➡️

First, we're discussing the "Operationalizing ML" interview study quoted above in our next reading group.

It's in ~20 hours from the posting of this tweet.

Sign up to join us, for this and future sessions, here:
crowdcast.io/e/fsdl-prodml-…

Sep 27, 2022 • 8 tweets • 3 min read

FSDL Lecture 8: ML products and orgs is now live!

Building an ML-powered product is about a lot more than just math and code.

In this lecture we cover some of the non-technical things you'll need to know, from hiring to product management and product design.

Building any product is hard, but ML adds additional complexity:

- It's interdisciplinary
- Talent is scarce
- Many orgs don't "get it" yet
- Projects have a high degree of uncertainty
- Products often need to be redesigned with ML in mind

Sep 20, 2022 • 13 tweets • 8 min read

FSDL Lecture 7: Foundation Models is now available!

This lecture is 💯 new to the course.

We talk about building on Transformers, GPT-3, CLIP, StableDiffusion, and other foundation models.

Brief thread below.

https://twitter.com/sergeykarayev/status/1572027734276345858

The brave new world of large models is astonishing.

With scale, these models show emergent capabilities that seem truly magical.

At hundreds of billions of params, many GPUs are needed simply to load the model, and API-based access makes a lot of sense.

Sep 6, 2022 • 13 tweets • 5 min read

FSDL Lecture 6: Deployment is now live!

This lecture covers a critical step: getting your model into prod.

The key message is similar to our philosophy in other parts of the ML workflow:

Start simple, add complexity as you need it.

fullstackdeeplearning.com/course/2022/le…

When it's time to deploy, the first step is to create a prototype you and your friends / teammates can interact with.

@Gradio, @huggingface, and @streamlit are your friends at this stage.

You do want this to have a basic UI and be hosted behind a webserver to reduce friction.

Sep 1, 2022 • 5 tweets • 3 min read

🧪 FSDL Lab 6: Data Annotation 🧪

Try out @LabelStudioHQ and see how the tasty Tensor sausage gets made out of data chuck with our latest lab notebook and video!

However much you care about data, you should probably care more.

High-quality data is still a major differentiator for ML app quality.

And good understanding of the data is a major differentiator for ML engineer quality!

Aug 31, 2022 • 8 tweets • 6 min read

📀 FSDL Lecture 4: Data Management 📀

The key message is simple enough: become one with the data, and don't overcomplicate things 🙃.

Find the video and notes on our website, and check out the thread below for some condensed learnings first.

fullstackdeeplearning.com/course/2022/le…

First, we talk about data storage.

• Speed and bandwidth of disks vary a lot, so use NVMe SSDs
• Store binary data in standard formats like JPGs
• Store metadata and text as JSON or Parquet
• Databases are the best tool for deep work with structured data
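
Two of the bullets above can be sketched with the Python standard library alone: metadata kept as JSON next to the binary files, then loaded into a database for structured queries. The file paths, field names, and table schema are hypothetical, and SQLite stands in for "a database" purely because it ships with Python.

```python
import json
import sqlite3

# Hypothetical metadata records; the JPGs themselves would live on disk.
records = [
    {"image_path": "images/0001.jpg", "label": "cat", "width": 640, "height": 480},
    {"image_path": "images/0002.jpg", "label": "dog", "width": 800, "height": 600},
]

# Store metadata as JSON (round-tripped through a string here instead of a file).
metadata_json = json.dumps(records, indent=2)
restored = json.loads(metadata_json)

# Load the same metadata into an in-memory database for structured queries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "image_path TEXT PRIMARY KEY, label TEXT, width INTEGER, height INTEGER)"
)
conn.executemany(
    "INSERT INTO images VALUES (:image_path, :label, :width, :height)",
    restored,
)
wide_labels = [
    row[0]
    for row in conn.execute(
        "SELECT label FROM images WHERE width > 700 ORDER BY image_path"
    )
]
conn.close()
```

In this sketch the JSON copy is the portable record and the database is where filtering and joins happen.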

Aug 23, 2022 • 11 tweets • 6 min read

FSDL Lecture 3: Troubleshooting & Testing is now live!

We cover:
• how to design software tests
• recommended tooling for testing and code quality assurance
• how to test ML systems, the easy and the hard way
• how to debug neural networks

(Link below)

The lecture video by @charles_irl is at

As always, our recommendations are specific and actionable. We recommend testing docstring code with doctests and quick-and-dirty notebook testing with nbformat.
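
For a feel of what testing docstring code with doctests looks like, here is a minimal sketch using Python's built-in `doctest` module. The `normalize` function is a made-up example, not code from the lecture or labs.

```python
import doctest

def normalize(pixels):
    """Scale 8-bit pixel values into the range [0, 1].

    The examples below are both documentation and tests:

    >>> normalize([0, 255])
    [0.0, 1.0]
    >>> normalize([])
    []
    """
    return [p / 255 for p in pixels]

# Collect and run the examples embedded in the docstring.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(normalize, "normalize", globs={"normalize": normalize}):
    runner.run(test)
results = runner.summarize(verbose=False)
```

In a regular module you would more commonly run `python -m doctest your_module.py` or call `doctest.testmod()`; the explicit finder and runner are used here only to keep the sketch self-contained.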
