last week i attended MLcon2.0 by @cnvrg_io and saw some great talks from all across the ML development stack

all of them are now available on-demand!

i'll call out some of my favorites here

cnvrg.io/mlcon-2

from @DivitaVohra, an overview of @Spotify's ML platform. super cool to hear how a product manager thinks about the problem of supporting ML systems

from @jeffboudier, an overview of the awesome work being done at @huggingface, with a focus on democratization of best practices, e.g. fast inference with Infinity

from @MarkMoyou of @nvidia, a really lucid overview of how to achieve low latency and high throughput in models on GPU. the visualization of sync/async GPU and CPU dataloaders is slick!
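
a minimal sketch of the sync vs. async loading idea, in plain Python. this is not the talk's visualization or any NVIDIA API: `load_batch`, `compute`, `run_sync`, and `run_async` are hypothetical names, and `time.sleep` stands in for both the CPU-side loading and the GPU step. the point is just that a background thread plus a bounded queue lets the next batch load while the current one computes.

```python
import queue
import threading
import time


def load_batch(i, load_time=0.01):
    """Stand-in for CPU-side data loading (decode, augment, collate)."""
    time.sleep(load_time)
    return list(range(i * 4, i * 4 + 4))


def compute(batch, compute_time=0.01):
    """Stand-in for the accelerator step; here it just sums the batch."""
    time.sleep(compute_time)
    return sum(batch)


def run_sync(n_batches):
    """Load, then compute, one batch at a time: the two stages never overlap."""
    return [compute(load_batch(i)) for i in range(n_batches)]


def run_async(n_batches, prefetch=2):
    """Load in a background thread so batch i+1 loads while batch i computes."""
    q = queue.Queue(maxsize=prefetch)  # bounded, so the loader can't run far ahead
    done = object()  # sentinel marking the end of the stream

    def producer():
        for i in range(n_batches):
            q.put(load_batch(i))
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        batch = q.get()
        if batch is done:
            break
        results.append(compute(batch))
    return results


if __name__ == "__main__":
    # Same results either way; the async version should finish sooner
    # because loading and compute overlap.
    for fn in (run_sync, run_async):
        start = time.perf_counter()
        out = fn(20)
        print(fn.__name__, out[:3], f"{time.perf_counter() - start:.2f}s")
```

in a real pipeline the same role is usually played by the framework's own data loader workers rather than a hand-rolled thread.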

and last (in temporal order!) from my top 4, @LambdaAPI COO Mitesh Agrawal on how to build a datacenter for GPU-accelerated ML

🔑: different compute, storage, and networking needs for different parts of the pipeline (HPO, distributed training, inference)

Thread by Charles 🎉 Frye

More from @charles_irl

Mar 3
i'm looking to start an interest group crossing over @full_stack_dl + @ml_collective!

we'll work through long-form content (h/t @chipro + @sh_reya) first, w sync discussions weekly to keep us on track

async folks can chat on discord, contribute to a wiki, + catch the recordings
this follows the format of really successful MLC interest groups in e.g. NLP (notion.so/MLC-NLP-Paper-…) and Computer Vision (notion.so/MLC-Computer-V…)

this group will focus on problems in production ML, like building datasets, monitoring models, and designing robust systems
we're organizing through the MLCollective Open Collab discord. won't link it on Twitter because bots + griefers

it's towards the bottom of the page here: mlcollective.org

come thru and drop some 🥞 if yr tryna!
Mar 1
really cool new #AISTATS2022 paper presenting 1) a particular setting for model monitoring and 2) a provably optimal strategy for requesting ground truth labels in that setting.

plus a bonus example, and theorem, on why you shouldn't just do anomaly detection on logits!
scene: data in real life is non-stationary, meaning P(X,Y) changes over time.

our model performance is based on that joint distribution, so model performance changes over time, mostly downwards.

this is bad.

it's the ML equivalent of dependency changes breaking downstream code
worse still, we don't even know when our performance is degrading, because we don't know what the right answer was.

the slogan: "ML models fail silently".

kinda like databases without monitoring, unlike things that fail loudly, like programs that halt or throw errors
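
to spell out the "performance is based on that joint distribution" step: one standard way to write it is as an expected loss. the symbols here are assumptions for illustration, not from the paper: $f$ is the deployed model, $\ell$ a loss, and $P_t$ the joint distribution of $(X, Y)$ at time $t$.

```latex
R_t(f) \;=\; \mathbb{E}_{(X,Y)\sim P_t}\big[\ell\big(f(X),\,Y\big)\big]
```

$f$ is frozen after training, so any change in $R_t(f)$ comes entirely from $P_t$ moving. and estimating $R_t(f)$ needs samples of $Y$, which is exactly what we don't have at serving time, hence the silence.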
Feb 26
Read through these awesome notes by @chipro and noticed something interesting about distribution shifts: they form a lattice, so you can represent them like you do sets, ie using a Venn diagram!

I find this view super helpful for understanding shifts, so let's walk through it.
(inb4 pedantry: the above diagram is an Euler diagram, not a Venn diagram, meaning not all possible joins are represented. that is good, actually, for reasons to be revealed!)
From the notes: joint distribution of data X and targets Y is shifting. We can decompose the joint into two pieces (marginal and conditional) in two separate ways (from Y or X).

There are four major classes of distribution shift, defined by which pieces vary and which don't.
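
Written out, the two decompositions of the joint are just the product rule applied in each direction:

```latex
\begin{aligned}
P(X, Y) &= P(Y \mid X)\,P(X) \\
        &= P(X \mid Y)\,P(Y)
\end{aligned}
```

The diagram isn't reproduced here, so the following enumeration is an assumption about what the four classes are: in each, one factor of one line stays fixed while its partner varies. That gives $P(X)$ varying with $P(Y \mid X)$ fixed (commonly called covariate shift), $P(Y \mid X)$ varying with $P(X)$ fixed, $P(Y)$ varying with $P(X \mid Y)$ fixed (commonly called label shift), and $P(X \mid Y)$ varying with $P(Y)$ fixed.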
Feb 25
There's been some back-and-forth about this paper on getting gradients without doing backpropagation, so I took a minute to write up an analysis on what breaks and how it might be fixed.

tl;dr: the estimated gradients are _really_ noisy! like wow

charlesfrye.github.io/pdfs/SNR-Forwa…
The main result I claim is an extension of Thm 1 in the paper. They prove that the _expected value_ of the gradient estimate is the true gradient, and I worked out the _variance_ of the estimate.

It's big! Each entry has variance equal to the entire true gradient's norm😬
(Sketch of the proof: nothing is correlated, everything has 0 mean and is symmetric around the origin, the only relevant terms are chi-squared r.v.s with known variances that get scaled by the gradient norms. gaussians are fun!)
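
A quick Monte Carlo check of both claims, as a sketch rather than the paper's or the write-up's code. It assumes the estimator has the form $(\nabla f \cdot v)\,v$ with $v$ drawn from a standard Gaussian; `forward_gradient_stats` is a hypothetical helper, and the "predicted" closed form $\lVert \nabla f \rVert^2 + (\nabla f)_i^2$ is my own calculation under that Gaussian assumption. It is dominated by the squared gradient norm when the dimension is large.

```python
import random


def forward_gradient_stats(grad, n_samples=200_000, seed=0):
    """Monte Carlo mean and variance of the estimator (grad . v) * v, v ~ N(0, I).

    `grad` plays the role of the true gradient; in a real setting the
    directional derivative grad . v would come from forward-mode autodiff.
    """
    rng = random.Random(seed)
    d = len(grad)
    sums = [0.0] * d
    sq_sums = [0.0] * d
    for _ in range(n_samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        directional = sum(g * vi for g, vi in zip(grad, v))
        for i in range(d):
            est = directional * v[i]  # i-th entry of the gradient estimate
            sums[i] += est
            sq_sums[i] += est * est
    means = [s / n_samples for s in sums]
    variances = [sq / n_samples - m * m for sq, m in zip(sq_sums, means)]
    return means, variances


if __name__ == "__main__":
    grad = [1.0, -2.0, 0.5, 3.0]
    sq_norm = sum(g * g for g in grad)
    means, variances = forward_gradient_stats(grad)
    for g, m, var in zip(grad, means, variances):
        # Assumed closed form for Gaussian v: Var = ||grad||^2 + g_i^2
        print(f"true={g:+.2f} mean={m:+.3f} var={var:.2f} predicted={sq_norm + g * g:.2f}")
```

The sample means should land near the true entries (unbiasedness), while even the smallest entry carries a variance on the order of the whole squared norm.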
Nov 18, 2021
the final video for the @weights_biases Math4ML series, on probability, is now up on YouTube!

@_ScottCondron and I talk entropies, divergence, and loss functions

🔗:
this is the final video in a four-part series of "exercise" videos, where Scott and I work through a collection of Jupyter notebooks with automatically-graded Python coding exercises on math concepts

read more in this 🧵

each exercise notebook has a corresponding lecture video.

the focus of the lectures is on intuition, and in particular on intuition that i think programmers trying to get better at ML will grok
Nov 8, 2021
New video series out this week (and into next!) on the @weights_biases YouTube channel.

They're Socratic livecoding sessions where @_ScottCondron and I work through the exercise notebooks for the Math4ML class.

Details in 🧵⤵️
Socratic: following an ancient academic tradition, I try to trick @_ScottCondron into being wrong, so that students can learn from mistakes and see their learning process reflected in the content.
(i was inspired to try this style out by the @PyTorchLightnin Master Class series, in which @_willfalcon and @alfcnz talk nitty-gritty of DL with PyTorch+Lightning while writing code. strong recommend!)
