the final video for the @weights_biases Math4ML series, on probability, is now up on YouTube!

@_ScottCondron and I talk entropies, divergences, and loss functions

🔗:
this is the final video in a four-part series of "exercise" videos, where Scott and I work through a collection of Jupyter notebooks with automatically-graded Python coding exercises on math concepts

read more in this 🧵

each exercise notebook has a corresponding lecture video.

the focus of the lectures is on intuition, and in particular on intuition that i think programmers trying to get better at ML will grok
for linear algebra, we take a "programmer's view": arrays are functions that operate on arrays -- loops, parallelization, and higher-order functions all appear
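here's a minimal sketch of that "programmer's view" in NumPy (my own toy example, not taken from the course notebooks): the same matrix-vector product written as an explicit loop over rows and as a parallelizable one-liner.

```python
import numpy as np

# a matrix can be read as a function from vectors to vectors;
# applying it is a loop over rows, each row an inner product
A = np.array([[1., 2.], [3., 4.]])
x = np.array([1., 1.])

# the "loop" view of matrix-vector multiplication
y_loop = np.array([sum(a_i * x_i for a_i, x_i in zip(row, x)) for row in A])

# the vectorized (parallelizable) view -- same function, no explicit loop
y_vec = A @ x

assert np.allclose(y_loop, y_vec)  # both give [3., 7.]
```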

for calculus, the focus is on approximation -- on using gradients as a "good enough" answer -- rather than on dynamics (as in physics) or limits (as in analysis)
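as a one-dimensional illustration of the "good enough" idea (my own toy example, not from the lectures): the gradient gives a linear guess for f near a point, and close to that point the guess is nearly indistinguishable from the real thing.

```python
def f(x):
    return x ** 2

def grad_f(x):
    return 2 * x  # derivative of x**2

# near x0, f(x0 + dx) is approximated by f(x0) + grad_f(x0) * dx
x0, dx = 1.0, 0.01
approx = f(x0) + grad_f(x0) * dx  # linear guess: 1.02
exact = f(x0 + dx)                # true value:   1.0201

# the error shrinks quadratically as dx shrinks -- "good enough" locally
assert abs(approx - exact) < 1e-3
```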

for probability, we cover the importance of unlikely events and the utility of log-probabilities, which can be understood as a quantification of the commonsense notion of "surprise"
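a quick illustration of that quantification (my own example, not from the notebooks): surprisal is just the negative log-probability, and it behaves the way common sense says surprise should.

```python
import math

def surprise(p):
    """surprisal in bits: the rarer the event, the bigger the surprise."""
    return -math.log2(p)

print(surprise(1.0))     # 0.0  -- a certain event carries no surprise
print(surprise(0.5))     # 1.0  -- a fair coin flip carries one bit
print(surprise(1/1024))  # 10.0 -- unlikely events dominate averages like cross-entropy
```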

these are my own, hard-won intuitions for these topics, honed by trying to program learning machines for a decade with more chutzpah than formal mathematical training

i hope they are as useful for others as they have been for me!


More from @charles_irl

8 Nov
New video series out this week (and into next!) on the @weights_biases YouTube channel.

They're Socratic livecoding sessions where @_ScottCondron and I work through the exercise notebooks for the Math4ML class.

Details in 🧵⤵️
Socratic: following an ancient academic tradition, I try to trick @_ScottCondron into being wrong, so that students can learn from mistakes and see their learning process reflected in the content.
(i was inspired to try this style out by the @PyTorchLightnin Master Class series, in which @_willfalcon and @alfcnz talk nitty-gritty of DL with PyTorch+Lightning while writing code. strong recommend!)

24 Aug
If you're like me, you've written a lot of PyTorch code without ever being entirely sure what's _really_ happening under the hood.

Over the last few weeks, I've been dissecting some training runs using @PyTorch's trace viewer in @weights_biases.

Read on to learn what I learned!
I really like the "dissection" metaphor

a trace viewer is like a microscope, but for looking at executed code instead of living cells

its powerful lens allows you to see the intricate details of what elsewise appears a formless unity

kinda like this, but with GPU kernels:
number one take-away: at a high level, there are two executions of the graph happening.

one, with virtual tensors, happens on the CPU.

it keeps track of metadata like shapes so that it can "drive" the second one, with the real tensor data, which happens on the GPU.
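if you want to poke at this yourself, here's a minimal sketch of capturing such a trace with torch.profiler (the model and filename are placeholders of mine, not from the thread; add ProfilerActivity.CUDA if you have a GPU to watch the second execution):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)

# record one forward/backward pass; the resulting JSON can be opened
# in a trace viewer (e.g. chrome://tracing or the W&B trace panel)
with profile(activities=[ProfilerActivity.CPU]) as prof:
    y = model(x).sum()
    y.backward()

prof.export_chrome_trace("trace.json")
```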
31 Jul 20
another great regular online talk series! they're talking about GPT-3 now
@realSharonZhou: sees opportunities in medicine with the "democratization" of the design of e.g. web interfaces.

this could be key for healthcare providers who have clinical expertise and know what patients need but don't have web design skills.
@DrHughHarvey sees this as a step towards the holy grail of ML in radiology: a model that takes in an image and returns a full radiology report.

the jump from GPT-2 to GPT-3 was just size. what might trillion-parameter models bring in other domains?
24 Jul 20
1/hella

this 🧵 by @daniela_witten is a masterclass in both the #SVD and in technical communication on Twitter.

i want to hop on this to expand on the "magic" of this decomposition and show folks where the rabbit goes, because i just gave a talk on it this week!

🧙‍♂️🐇💨😱
tl;dr: the basic idea of the SVD works for _any_ function.

it's a three step decomposition:

- throw away the useless bits ⤵
- rename what remains 🔀
- insert yourself into the right context ⤴
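as a quick NumPy illustration of those three steps (my own example, not from @daniela_witten's thread): the SVD factors any matrix into a renaming rotation, a scaling that reveals which directions matter, and a rotation back into context.

```python
import numpy as np

A = np.array([[3., 0.], [4., 5.]])

# Vh renames (rotates into the right coordinates), S scales and exposes
# the useless directions (tiny singular values), U inserts the result
# back into the output's context
U, S, Vh = np.linalg.svd(A)

# the three steps compose back into the original function
assert np.allclose(A, U @ np.diag(S) @ Vh)

# "throw away the useless bits": keep only the largest singular value
# for the best rank-1 approximation of A
A1 = S[0] * np.outer(U[:, 0], Vh[0, :])
```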
also, if you're more of a "YouTube talk" than a "tweet wall" kinda person, check out the video version, given as part of the @weights_biases Deep Learning Salon webinar series

