1/5
PyTorch 2.0 introduces torch.compile, a compiled mode that accelerates your model without requiring any changes to your model code. Across 163 open-source models spanning vision, NLP, and other domains, we found that 2.0 speeds up training by 38-76%.
2/5
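A minimal sketch of the one-line opt-in (the torchvision model, input shape, and optimizer here are placeholder assumptions, not from the thread):

```python
import torch
import torchvision.models as models

model = models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One line opts the model into compiled mode; the first call
# triggers compilation, subsequent calls reuse the compiled code.
compiled_model = torch.compile(model)

x = torch.randn(16, 3, 224, 224, device="cuda")
out = compiled_model(x)   # compiled forward pass
out.sum().backward()      # backward works as usual
optimizer.step()
```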
Oct 25, 2022 • 4 tweets • 2 min read
On #TutorialTuesdays we revisit resources to power your PyTorch learning journey. This week’s beginner-basics, text-based lesson is in two parts: Datasets & Dataloaders and Transforms. Read on for highlights of what you’ll learn: 🧵👇 bit.ly/3D8cmyy
1/4
From the PyTorch Datasets & Dataloaders tutorial: code for processing data samples can get messy and hard to maintain. Dataset code should be decoupled from model-training code for better readability & modularity. Learn how in 5 steps.
2/4
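A minimal sketch of the pattern that tutorial teaches (ToyDataset and the random tensors are made-up stand-ins):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical in-memory dataset of (feature, label) pairs."""
    def __init__(self, features, labels, transform=None):
        self.features = features
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        if self.transform:
            x = self.transform(x)
        return x, y

features = torch.randn(100, 8)
labels = torch.randint(0, 2, (100,))
loader = DataLoader(ToyDataset(features, labels), batch_size=16, shuffle=True)

# The training loop only ever sees batches: it stays
# decoupled from how samples are stored and transformed.
for xb, yb in loader:
    pass
```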
Nov 16, 2021 • 10 tweets • 11 min read
Get ready for PyTorch Developer Day on December 1-2, 2021! We’ve got an amazing lineup of speakers for you on Day 1.
Check the thread below to see the speakers ⬇️
1/10
🎙Keynote Speakers🎙
1. Lin Qiao - Engineering Director @MetaAI
2. @DougalMaclaurin - Sr. Research Scientist @Google
3. Philippe Tillet - Member of Technical Staff @OpenAI
4. @dwarak - Engineering Director @MetaAI
5. @dzhulgakov - Software Engineer @MetaAI
2/10
Oct 25, 2021 • 10 tweets • 4 min read
ICYMI: PyTorch 1.10 was released last Thursday. Here are some highlights of the release.
Stay tuned for tweet threads in the next couple weeks delving deeper into these cool new features!
1/8
CUDA Graphs are now in beta. They let you capture (and replay!) static CUDA workloads as a single unit, eliminating repeated kernel-launch overhead for big reductions in CPU-side cost. Our integration allows for seamless interop between CUDA graphs and the rest of your model.
2/9
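A minimal forward-only capture/replay sketch of the beta API (the model and shapes are placeholders; assumes a CUDA device):

```python
import torch

model = torch.nn.Linear(256, 256).cuda().eval()

# Graphs replay fixed memory addresses, so use a static input
# buffer and copy new data into it before each replay.
static_x = torch.randn(32, 256, device="cuda")

# Warm up on a side stream before capture (required by the API).
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s), torch.no_grad():
    for _ in range(3):
        model(static_x)
torch.cuda.current_stream().wait_stream(s)

# Capture one forward pass into a graph, then replay it:
# a single launch re-runs the whole captured workload.
g = torch.cuda.CUDAGraph()
with torch.no_grad(), torch.cuda.graph(g):
    static_y = model(static_x)

static_x.copy_(torch.randn(32, 256, device="cuda"))
g.replay()  # static_y now holds the output for the new input
```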
Oct 19, 2021 • 11 tweets • 4 min read
✨ Low Numerical Precision in PyTorch ✨
Most DL models use single-precision (fp32) floats by default.
Lower numerical precision - while reasonably maintaining accuracy - reduces:
a) model size
b) memory required
c) power consumed
Thread about lower precision DL in PyTorch ->
1/11
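For a quick feel of (a), a sketch of the model-size effect (the layer here is an arbitrary stand-in): casting to fp16 halves parameter storage.

```python
import torch

model = torch.nn.Linear(1024, 1024)

def param_bytes(m):
    # Total bytes occupied by the model's parameters.
    return sum(p.numel() * p.element_size() for p in m.parameters())

print(param_bytes(model))          # fp32: 4 bytes per element
print(param_bytes(model.half()))   # fp16: 2 bytes per element, half the size
```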
Lower precision speeds up:
* compute-bound operations, since lower-precision math has higher hardware throughput
* memory-bandwidth-bound operations, since smaller data types mean less data to move
In many deep models, memory access dominates power consumption; reducing memory I/O makes models more energy efficient.
2/11
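A minimal mixed-precision training sketch with PyTorch's AMP utilities (the model, shapes, and learning rate are placeholder assumptions):

```python
import torch

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()  # rescales grads to avoid fp16 underflow

x = torch.randn(64, 512, device="cuda")
target = torch.randn(64, 512, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():  # ops run in fp16 where safe, fp32 elsewhere
    loss = torch.nn.functional.mse_loss(model(x), target)

scaler.scale(loss).backward()  # backward on the scaled loss
scaler.step(optimizer)         # unscales grads, then steps
scaler.update()
```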
Sep 14, 2021 • 7 tweets • 3 min read
Want to make your inference code in PyTorch run faster? Here’s a quick thread on doing exactly that.
1. Replace torch.no_grad() with the ✨torch.inference_mode()✨ context manager.
2. ⏩ inference_mode() is torch.no_grad() on steroids
While NoGrad only excludes operations from being tracked by Autograd, InferenceMode takes that two steps further: it also disables view tracking and version-counter bumps, potentially speeding up your code (YMMV depending on model complexity and hardware)
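A minimal before/after sketch (the model and input shapes are arbitrary stand-ins):

```python
import torch

model = torch.nn.Linear(128, 10).eval()
x = torch.randn(1, 128)

# Before: no_grad() disables gradient tracking.
with torch.no_grad():
    y1 = model(x)

# After: inference_mode() additionally skips view tracking and
# version-counter bumps, shaving more per-op overhead.
with torch.inference_mode():
    y2 = model(x)

# Caveat: tensors created in inference mode cannot be used
# in autograd-tracked computations later.
print(y2.requires_grad)  # False
```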