Zach Mueller
Building things openly 🤗 | CS nerd in a DL world | Lifts to stay sane | #ADHD | He/Him
Aug 28, 2023 7 tweets 2 min read
Excited to announce a new @huggingface space to help with one of machine learning's biggest questions:

How much space does {X} model take in vRAM? And most importantly: when using `device_map="auto"`

huggingface.co/spaces/hf-acce…
This space uses Accelerate's big model inference library and `device_map="auto"` to load in a model and measure its skeleton, helping you estimate the largest layer and the total size of the model you want to load into memory (while only using a tiny amount of RAM)!
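To make the arithmetic behind such an estimate concrete, here is a minimal sketch. It is not the Space's actual code: it only assumes that a layer's weight footprint is roughly its parameter count times the bytes per parameter, and the layer names and parameter counts are made-up illustrative values.

```python
# Bytes used per parameter for a few common dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def estimate_sizes(layer_param_counts, dtype="float32"):
    """Return (total_bytes, largest_layer_name, largest_layer_bytes)."""
    bytes_per = BYTES_PER_PARAM[dtype]
    sizes = {name: count * bytes_per for name, count in layer_param_counts.items()}
    largest = max(sizes, key=sizes.get)
    return sum(sizes.values()), largest, sizes[largest]


# Hypothetical layers and parameter counts, for illustration only.
layers = {
    "embed": 50_000_000,
    "block_0": 12_000_000,
    "block_1": 12_000_000,
    "head": 1_000_000,
}
total, name, largest = estimate_sizes(layers, dtype="float16")
print(f"total: {total / 1024**2:.1f} MiB")
print(f"largest layer: {name} ({largest / 1024**2:.1f} MiB)")
```

This only counts weights; activations and any framework overhead would come on top of these numbers.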
Oct 13, 2022 5 tweets 3 min read
Today marks an extremely exciting day for fans of #nbdev: I'm releasing a new project, "nbdev-extensions"! This PyPI package will contain features that others and I have thought of, and that I've brought to life in the nbdev framework for everyone to try!

muellerzr.github.io/nbdev-extensio…

1/5 The first extension is a `new_nb` command. This will quickly generate a new blank template notebook for you to immediately dive into as you're exploring nbdev, and is fully configurable for how your notebook's content should be:

2/5
Sep 2, 2022 9 tweets 5 min read
You may know that @huggingface Accelerate has big-model inference capabilities, but how does that work?

With the help of #manim, let's dig in!

Step 1:
Load an empty model into memory using @PyTorch's `meta` device, so it uses a *super* tiny amount of RAM

Step 2:
Load a single copy of the model's weights into memory
Jul 6, 2022 4 tweets 2 min read
New article on #python decorators is out! Specifically, it shows you how decorators are written, what they do, and the power they give you. I even show an example of when you'd use the strange `nonlocal` keyword 1/3
muellerzr.github.io/fastblog/pytho…

Context manager sequel should be out in the next few days. This one will take a bit longer because in some cases decorators are context managers, and they also have a few more rules, so it'll take some time for me to get it how I want it :) 2/3
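For readers who haven't met `nonlocal` before, here is a small illustrative sketch (not taken from the article) of a decorator that needs it. The names `count_calls` and `greet` are made up for this example.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times `func` has been called."""
    calls = 0  # lives in the enclosing scope of `wrapper`

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Without `nonlocal`, `calls += 1` would try to create a new local
        # variable and fail; `nonlocal` rebinds the enclosing one instead.
        nonlocal calls
        calls += 1
        wrapper.calls = calls  # expose the count for inspection
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name):
    return f"Hello, {name}!"


greet("Ada")
greet("Grace")
print(greet.calls)
```

The decorator replaces `greet` with `wrapper`, and the counter survives between calls because `wrapper` closes over `calls`.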
Jul 5, 2022 4 tweets 3 min read
Listened to everyone's responses to the new `no_sync` wrapper in @huggingface's Accelerate and I took them to heart.

Here's our new gradient accumulation context manager available in Accelerate dev now! A thread on design choices and the struggles 1/4🧵

@huggingface The goal with Accelerate is to abstract as little as we possibly can, so you can perform what you want on any training device (CPU, multi-GPU, etc). As a result, it came down to a decision of "how can we simplify gradient accumulation, without hiding anything?" 2/4
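To illustrate the general idea of a gradient accumulation context manager, here is a toy sketch in plain Python. It is not Accelerate's implementation: `ToyAccumulator` and its attributes are hypothetical names, and the optimizer step is replaced by a stand-in.

```python
from contextlib import contextmanager


class ToyAccumulator:
    """Hypothetical helper: tracks when accumulated gradients should be applied."""

    def __init__(self, accumulation_steps):
        self.accumulation_steps = accumulation_steps
        self.step = 0
        self.sync_gradients = False

    @contextmanager
    def accumulate(self):
        self.step += 1
        # Only every `accumulation_steps`-th batch triggers an optimizer update.
        self.sync_gradients = self.step % self.accumulation_steps == 0
        yield self.sync_gradients


acc = ToyAccumulator(accumulation_steps=4)
updates = []
for batch in range(8):
    with acc.accumulate() as should_step:
        # A real loop would run the forward and backward pass on every batch.
        if should_step:
            updates.append(batch)  # stand-in for optimizer.step() and zero_grad()
print(updates)
```

The bookkeeping stays visible in the loop, which is in the spirit of simplifying accumulation without hiding anything.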
May 19, 2022 4 tweets 2 min read
A few tips and tricks I learned today about @Docker and keeping image sizes small 🧵

Use a multi-stage approach to keep the resulting image lightweight by pre-compiling all of the installs and then just bringing those installed files into the end image. I could save 500 MB+ in some cases by doing this
Feb 6, 2022 5 tweets 2 min read
Tonight we're talking about @fastdotai's `tabular_learner`, and more specifically the `TabularModel` 🧵

The role of the `tabular_learner` is mostly to build a `TabularModel` for your data. This tabular model is a series of embedding matrices and some batch normalization, before going through a few rounds of `LinBnDrop`, as shown below 2/
Feb 4, 2022 5 tweets 2 min read
What is @fastdotai's `cnn_learner`, and what magic does it do? 🧵

The `cnn_learner` builds a fastai Learner designed specifically for vision transfer learning, using some of the best practices.

We start with a baseline `arch`, such as a resnet34, cut off the last layer, and introduce a @fastdotai head (such as below) for our task 2/
Nov 13, 2021 7 tweets 2 min read
Gave it a second read through (I had the opportunity to read the first draft a while ago), below you can find a thread of my review, and some bits I enjoyed from it: This book is an excellent companion to something like the @fastdotai book, course, or Walk with fastai. It explores some areas differently than what is presented in the course, which can perhaps help folks get a better grasp of some concepts. 1/
Nov 11, 2021 4 tweets 1 min read
It's always a welcome surprise when I see fastinference being used 😁

🤯 Okay, actual numbers I DID NOT expect. The last release of fastinference was in MARCH...
Nov 10, 2021 8 tweets 2 min read
Why does #nbdev use such weird names for your notebooks, such as "00_core.ipynb"?

There are actually a few reasons. Let's talk about that 🧵

First, it helps keep things organized module-wise. Numbering everything lets you section off, by group, how certain segments of code are laid out.

An example of this is in @fastdotai, where notebooks starting with 20 are generally vision tutorials
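As a small illustration of why numeric prefixes help, the sketch below sorts and groups notebook names. The filenames are hypothetical, and grouping by the tens digit is an assumption made for this example rather than a documented nbdev rule.

```python
import re
from collections import defaultdict

# Hypothetical notebook names, for illustration only.
notebooks = [
    "20_vision_tutorial.ipynb",
    "00_core.ipynb",
    "21_vision_augment.ipynb",
    "01_data.ipynb",
]


def group_by_tens(names):
    """Group notebooks by the tens bucket of their numeric prefix."""
    groups = defaultdict(list)
    # Zero-padded prefixes sort correctly as plain strings.
    for name in sorted(names):
        match = re.match(r"(\d+)_", name)
        if match:
            bucket = int(match.group(1)) // 10 * 10
            groups[bucket].append(name)
    return dict(groups)


print(group_by_tens(notebooks))
```

Because the prefixes are zero-padded, a plain alphabetical sort already gives the intended reading order.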
Nov 9, 2021 4 tweets 2 min read
So. Post graduation plans:

- @jeremyphoward's Matrix Calculus for DL
- @math_rachel's Computational Linear Algebra

- W&B's Math for ML

(In this order) Matrix Calc: explained.ai/matrix-calculu…
Aug 26, 2021 13 tweets 2 min read
I'm going to be releasing a video extremely soon on my journey through fastai, open source, and how it all merges together. In the meantime, I wanted to outline below the Software Design and Development program my school (@UWF) offers, semester by semester: 🧵

Semester 1:

- C++ Programming. Getting you used to the nuances of memory, objects, variables, and so forth
Aug 25, 2021 4 tweets 2 min read
Friends that are familiar with @github actions, is it possible to deploy to gh pages by using files generated *from* an action? I don't mean building to another branch and then deploying that branch (that I can do), I mean using in-memory files to deploy from @github Scenario: I have a bunch of .md's I've made to build some docs, but I want to segment out another git repository that handles things like the Gemfiles and whatnot. What this action should do is pull those gemfiles (which I can already do) and then deploy on this current state
Aug 24, 2021 6 tweets 2 min read
As promised, here are our textbooks and optional readings for Software Engineering 2 and Software Engineering Management. (There were none for SE1) 🧵

Software Engineering 2:

Martin, Robert. Clean Architecture: A Craftsman's Guide to Software Structure and Design. Prentice Hall, 2018. ISBN: 978-0134494166

We discussed how to write clean code properly and how to handle issues within it 2/
Aug 7, 2021 8 tweets 2 min read
How can you learn to use the @fastdotai framework to its fullest extent? A thread on what I believe is the most important lesson you can teach yourself: 👇

1/
First: fixing a misconception. At its core, fastai is just PyTorch. It uses torch tensors, trains with the torch autograd system, and uses torch models.

Don't believe me? Let's talk about how you can learn to see this and utilize it

2/
Jun 2, 2021 7 tweets 2 min read
I've written a notebook showing three additional functions for helping you navigate the @fastdotai source code and save you potentially hours of time: gist.github.com/muellerzr/3302…

1/
Why did I write this? Navigating the source code for fastai can be hard sometimes, especially trying to consolidate all the patch and typedispatch functionalities (especially because typedispatch doesn't show up in the __all__!)

So, how does this work? 2/
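For context on why patched functionality is hard to track down, here is a toy sketch of the general "patch" pattern in plain Python. It is not fastcore's implementation: `toy_patch` and `Widget` are hypothetical names used only for illustration.

```python
def toy_patch(cls):
    """Hypothetical decorator: attach the decorated function to `cls` as a method."""

    def decorator(func):
        setattr(cls, func.__name__, func)
        return func

    return decorator


class Widget:
    """Stand-in class for the example."""

    def __init__(self, name):
        self.name = name


# Defined outside the class body, possibly in another module entirely,
# so reading the class definition alone will not reveal this method.
@toy_patch(Widget)
def describe(self):
    return f"Widget({self.name})"


print(Widget("demo").describe())
print("describe" in Widget.__dict__)
```

The method ends up on the class, but its source lives wherever the patched function was written, which is why dedicated navigation helpers can save time.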
May 20, 2021 10 tweets 3 min read
I got asked a question at the end of my stream by @lukemshepherd I didn't quite get to answer:

​Do you have any advice for someone who wants to get into contributing to the @fastdotai library who has experience with fastai but not software development?

Let's try to answer:

If you have absolutely zero software development experience, I recommend watching the first few lectures in the latest part 2 (yes, from a few years ago!). Since it's building a library from scratch, Jeremy covers how he approaches software design, which can help you understand the design
May 6, 2021 5 tweets 2 min read
With school over, I'll reveal one of my secrets. I'm making a new course!

Running with @fastdotai! Times and dates are TBD, but I'm shooting to hold the course this Fall. This will be a continuation of Walk with fastai, building on what we learned there and applying it 1/

The plan is to split it up into three sections: Debugging, Implementations, and Guest Speakers.

I want the first section to cover debugging in fastai, bringing raw torch code over (direct 1:1 conversions), and exploring inference heavily
May 5, 2021 10 tweets 6 min read
It's a very big day for me today, as I'm officially releasing version 0.2.3 for the @AdaptNLP project. With it comes a slew of changes, but what exactly?

A preview:
#nbdev, @fastdotai, #fastcore, and something called a ModelHub for both #FlairNLP and @huggingface, let's dive in:

First up, #nbdev:

Thanks to the lib2nbdev package (novetta.github.io/lib2nbdev), we've completely restructured the library to follow test-driven development with nbdev, with integration tests and everything else that comes with the workflow 2/9
Apr 14, 2021 16 tweets 6 min read
Deploying with @fastdotai isn't always `learn = load_learner()`, `learn.predict`. There are numerous scenarios where you might want only part, or none, of the API or of the library as a whole. In this thread we will be exploring your options, how they work, and what to do: 1/n

Ideally we have the following context:

DataBlock -> DataLoaders -> Model -> Learner -> Train

This can then stem off to a few things:

1. learn.export() -> Model and DataLoaders (which are now blank) ...