I cut my teeth on TensorFlow 1, where graphs were compiled ahead of time, and did a lot of my grad school work in classic CPU-only autograd because I needed forward-mode differentiation for fast Hessians (don't ask). So this was not at all obvious to me!
This is also why it's so 🔑 that you use num_workers>0 in your DataLoader.
Otherwise, the CPU forward pass won't start until the batch has been loaded, and then the next batch won't start loading until the optimizer step is done.
That's a lot of (expensive!) idle GPU time😬
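In case it helps, here's roughly what that looks like in code. This is a minimal sketch: the dataset, batch size, and worker count are all made up, so tune them for your setup.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# toy stand-in dataset; any Dataset works the same way
dataset = TensorDataset(torch.randn(10_000, 128), torch.randint(0, 10, (10_000,)))

loader = DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=4,     # load the next batch in background worker processes
    pin_memory=True,   # page-locked host memory -> faster async host-to-device copies
)

device = "cuda" if torch.cuda.is_available() else "cpu"

for xb, yb in loader:
    # non_blocking copies can overlap with compute when pin_memory=True
    xb = xb.to(device, non_blocking=True)
    yb = yb.to(device, non_blocking=True)
    # ... forward / backward / optimizer step ...
```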
I learned a whole lot more using the trace viewer, including the reasoning behind most of @karpathy's hitherto mysterious tips on optimizing @PyTorch.
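For the curious, generating one of those traces yourself is only a few lines with torch.profiler. A minimal sketch (the model and batch here are placeholders):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(512, 512)   # stand-in model
x = torch.randn(64, 512)            # stand-in batch

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    activities.append(ProfilerActivity.CUDA)

with profile(activities=activities, record_shapes=True) as prof:
    for _ in range(5):
        model(x).sum().backward()

# writes a Chrome-trace JSON you can open in the trace viewer (chrome://tracing or Perfetto)
prof.export_chrome_trace("trace.json")
```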
PS: this hot'n'fresh @weights_biases feature is courtesy of @vanpelt, who incorporated PyTorch's excellent trace viewer into our Artifacts system so that traces can more easily be tracked, shared, and integrated into dashboards and reports.
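I won't speak to the exact shape of the built-in integration, but logging a trace file as an Artifact by hand looks roughly like this (project and file names are made up):

```python
import wandb

run = wandb.init(project="profiling-demo")   # hypothetical project name

artifact = wandb.Artifact("pytorch-trace", type="profile")
artifact.add_file("trace.json")              # the Chrome trace exported above
run.log_artifact(artifact)
run.finish()
```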
• • •
tl;dr: the basic idea of the SVD works for _any_ function.
it's a three-step decomposition (matrix-flavored sketch below ⬇️):
- throw away the useless bits ⤵
- rename what remains 🔀
- insert yourself into the right context ⤴
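For the concrete matrix case, one way to read those three steps against the factors of A = U Σ Vᵀ, applied right to left (the mapping of metaphor to factor is my reading, and the matrix here is random):

```python
import torch

A = torch.randn(5, 3)                            # any linear map will do
U, S, Vh = torch.linalg.svd(A, full_matrices=False)

x = torch.randn(3)

coords = Vh @ x      # re-express x in the input directions that matter;
                     # "throwing away the useless bits" = truncating directions with tiny singular values
scaled = S * coords  # "rename what remains": each retained direction gets its own stretch factor
y = U @ scaled       # "insert into the right context": place the result along the output space's directions

assert torch.allclose(y, A @ x, atol=1e-5)       # same answer as applying A directly
```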
also, if you're more of a "YouTube talk" than a "tweet wall" kinda person, check out the video version, given as part of the @weights_biases Deep Learning Salon webinar series