Also, I've been planning to start building stuff and learning about everything that is needed to build a fully-fledged web app - the perfect storm.
5/
I'm starting with front-end and working my way to back-end and MLOps.
Here is a nerdy summary of the process behind how the app was created.
Data portion of the app (Python): 1) Figured out how to use the YouTube Data API (developers.google.com/youtube/v3/get…) to scrape the video ids..
6/
from a particular YouTube channel -> this took way too much time :) the main idea is to find the "uploaded videos" playlist and get the ids from there (you can't get it from the channel itself…not intuitive).
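The "uploads playlist" trick above can be sketched roughly like this. A minimal, hypothetical sketch using `requests` against the YouTube Data API v3 — `API_KEY` and the channel id are placeholders you'd supply yourself:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: your own Data API v3 key
BASE = "https://www.googleapis.com/youtube/v3"

def uploads_playlist_id(channel_id: str) -> str:
    """Ask the channels endpoint for the channel's 'uploads' playlist id."""
    r = requests.get(f"{BASE}/channels", params={
        "part": "contentDetails", "id": channel_id, "key": API_KEY})
    r.raise_for_status()
    item = r.json()["items"][0]
    return item["contentDetails"]["relatedPlaylists"]["uploads"]

def uploads_playlist_shortcut(channel_id: str) -> str:
    """Well-known shortcut: the uploads playlist id is just the
    channel id with its 'UC' prefix swapped for 'UU'."""
    assert channel_id.startswith("UC")
    return "UU" + channel_id[2:]

def video_ids(playlist_id: str) -> list:
    """Page through playlistItems (max 50 per page) collecting video ids."""
    ids, token = [], None
    while True:
        params = {"part": "contentDetails", "playlistId": playlist_id,
                  "maxResults": 50, "key": API_KEY}
        if token:
            params["pageToken"] = token
        page = requests.get(f"{BASE}/playlistItems", params=params).json()
        ids += [it["contentDetails"]["videoId"] for it in page["items"]]
        token = page.get("nextPageToken")
        if not token:
            return ids
```

The non-obvious part (and the time sink mentioned above) is that there is no "list a channel's videos" call — you go channel → uploads playlist → playlistItems.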
..files, thumbnails, and video metadata -> amazing tool, but it took some time to figure out how to download exactly what I need. Documentation is very verbose. 😅
I was tracking/logging the temperature using HWiNFO64 - this is the first time I'm seriously using my GPU since I built the digital dragon so I wanted to be sure I won't set my house on fire. 😂
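HWiNFO64 is a Windows GUI tool; for a scriptable alternative (assuming an NVIDIA card with the standard driver), a small polling sketch via `nvidia-smi` looks like this:

```python
import subprocess

def parse_temp(raw: str) -> int:
    """nvidia-smi prints one integer per GPU, one per line; take the first."""
    return int(raw.strip().splitlines()[0])

def read_gpu_temp() -> int:
    """Query GPU temperature in degrees C via nvidia-smi (ships with the driver)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout
    return parse_temp(out)
```

You could call `read_gpu_temp()` in a loop with `time.sleep()` and append to a CSV to get the same kind of log.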
This was fairly easy as I was very familiar with the Whisper codebase...
9/
(made a video on it: ). First night: a Windows update stopped my overnight run… :) Beautiful.
4) Finally, I preprocessed the transcripts to remove the headers, preprocessed metadata, calculated estimated reading time (using the same heuristic as Medium)...
10/
..and resized my thumbnails (trying to linearize the story but this actually happened during the front-end app optimization step).
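Medium's reading-time heuristic is roughly 265 words per minute, rounded up. A sketch of that plus the thumbnail resize (Pillow is imported lazily inside the resize helper so the reading-time part stands alone; target width and quality are my assumptions):

```python
import math

WPM = 265  # Medium's stated average adult reading speed

def reading_time_minutes(text: str) -> int:
    """Estimated reading time, rounded up, minimum 1 minute."""
    words = len(text.split())
    return max(1, math.ceil(words / WPM))

def resize_thumbnail(src: str, dst: str, width: int = 640) -> None:
    """Downscale a thumbnail to a fixed width, preserving aspect ratio."""
    from PIL import Image  # Pillow; imported here so the helper above has no deps
    img = Image.open(src)
    ratio = width / img.width
    img.resize((width, int(img.height * ratio))).save(dst, quality=85)
```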
FE portion (@nextjs, @tailwindcss): 1) Did mini research on the FE tech to use: ended up w/ @reactjs i.e. Next.js (but along the way...
11/
considered Gatsby, Flask, Django, Angular, etc.). A lot of opinions on the internetz on what to use. Also learning a bit about Go use-cases and other BE tech. Learning JS concepts like async-await, serverless, etc.
2) Binge-watched videos on @reactjs and Next.js.
12/
(one React tutorial I watched was 12 hours long!). This took 2 days (all while working at DeepMind full-time). Strategy? 2x speed, using the skip 5 secs button, etc. Read the official Next.js tutorial (super useful!).
3) Wanted to minimize the amount of time spent on prep
13/
(stuff above) so I started coding immediately (I learn the best by doing). Found multiple Next.js templates I liked, got the projects working locally, analyzed how UI components I care about work, and merged stuff.
4) Problems in paradise -> had to learn to use Chrome dev..
14/
..tools, console.log won't cut it anymore. Watching tutorials, learned a ton about browsers in general, event loops, and all of the dev tools tabs (network, lighthouse, etc.).
Debugging server side code was a challenge -> chrome://inspect/#devices to the rescue...
15/
(vscode was not working properly so I ended up using chrome dev tools). Also debugging JS in vscode is not as intuitive as Python…
Setting up env with npm/yarn was also not without its problems. :)
5) Learning tailwind (CSS framework) ad hoc (no tutorials just googling
16/
+ @OpenAI ChatGPT! It's good, my God!) to design the navbar, email form, etc. Completely refactored the codebase, made it minimal, understood every single detail. Literally. I'm obsessed.
6) Hitting optimization issues. I realized I'm loading all of the transcripts and not
17/
even using them! (this boosted my Lighthouse perf score from 20% -> 90%!). Additionally: resizing thumbnails and pruning metadata.
7) Testing, testing! Tested the app with light/dark default OS theme and different devices (Chrome dev tools again), fixed all console errors..
18/
8) Social image preview problems w/ @Twitter card -> turned out I need a robots.txt otherwise the Twitter bot can't pull the image to create the card!
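For context: Twitter's crawler (Twitterbot) checks robots.txt before fetching card images, so a missing or overly restrictive robots.txt breaks the preview. A minimal permissive robots.txt:

```text
User-agent: *
Allow: /
```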
9) Learning MailChimp API to add the backend functionality for subscribing to my list (you should do it! ❤️
19/
-> see the email box in the app). It just wasn't working, so I downloaded Postman to debug it. Sneaky bug: I had manually and permanently removed my email from my audience (MailChimp lingo) - because of that I couldn't add myself back via the API!
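The subscribe call itself is a single POST against Mailchimp's Marketing API v3 — a hedged sketch (the datacenter suffix of your API key picks the host; `status="pending"` triggers the double opt-in confirmation email):

```python
import requests

def datacenter_from_key(api_key: str) -> str:
    """Mailchimp API keys end in '-<dc>'; that dc picks the API host."""
    return api_key.rsplit("-", 1)[1]

def subscribe(api_key: str, list_id: str, email: str):
    """POST a new list member; 'pending' sends a confirmation email."""
    dc = datacenter_from_key(api_key)
    url = f"https://{dc}.api.mailchimp.com/3.0/lists/{list_id}/members"
    r = requests.post(url, auth=("anystring", api_key),
                      json={"email_address": email, "status": "pending"})
    return r.status_code, r.json()
```

The bug above shows up here as a 400 response with a "member was permanently deleted" style error, which is exactly the kind of thing Postman makes easy to see.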
10) Trying to make the confirmation email nicer
20/
..it's ugly. If anyone has a recommendation on what to use for this purpose (create an email list) let me know!
11) Bought the domain you saw above, hosting on hobby plan on Vercel, setting env vars for Mailchimp and GA.
21/
12) Beta testing with a small selected group of people (started last night) 😂 One piece of feedback I got was: you're faster than my fastest FE developer, lol! Similar feedback from my friend who's an actual FE engineer! And I hadn't done any FE until 7 days ago. 😅
22/
It's easy to pick up things once you have the fundamentals in place.
Additional remark: found @OpenAI Copilot, ChatGPT, vscode trio super useful during my dev process. I never felt like this before.
Looking into the future, I have many ideas on how to improve the app:
23/
* Add episode summaries using LLMs - transcripts are very long
* Group transcripts into chapters for easier navigation
* Add navbar and search on the transcripts page
* Add fuzzy or even better neural search so that you can find even more abstract thoughts
* Maybe add Q&A
24/
* Automate the process from the moment @hubermanlab uploads a video till the moment transcription is available on my app. Currently doing this semi-manually, with small overhead.
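The automation idea above reduces to a scheduled job that polls the uploads playlist and diffs against what's already been transcribed — the only genuinely new logic is the diff (a hypothetical helper; the polling/transcribe calls are placeholders):

```python
def new_video_ids(latest_ids, seen_ids):
    """Ids present in the feed that we haven't processed yet, feed order preserved."""
    seen = set(seen_ids)
    return [v for v in latest_ids if v not in seen]

# Sketch of the scheduled job:
# latest = video_ids(uploads_playlist)      # poll the uploads playlist
# for vid in new_video_ids(latest, seen):   # only the unprocessed ones
#     transcribe_and_publish(vid)           # whisper -> preprocess -> deploy
```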
Any feedback is welcome!!! ❤️❤️❤️
End of MEGA thread.
25/25
[16:55 - 58:00] They introduced a prototype of their humanoid robot - Optimus. Only a concept last year - and now a reality. The progress was incredibly fast!
1/
Throughout the presentation, they stressed that there are so many parallels between building a humanoid robot and building a self-driving car. That's why the progress was so fast - they could reuse the supply chain, the training infra, etc.
2/
[01:17:00] They seem to be very bullish on using NeRF to aid their occupancy network predictions.
As @karpathy's reply to @tszzl here explains, we shouldn't read their use of NeRF as a change of mind - what matters is whether the net is used at train time or at test time.
If you want to understand how Stable Diffusion works behind the scenes I walk you through the codebase (github.com/CompVis/stable…) step by step explaining:
1. First stage autoencoder training (with KL regularization)
2. Latent Diffusion Model training (UNet + cond model)
2/
3. Sampling using PLMS scheduler and how a link to differential equations enables us to sample much faster
Stable diffusion directly builds upon the "High-Resolution Image Synthesis with Latent Diffusion Models" paper: arxiv.org/abs/2112.10752
3/
[💥 Open-sourcing Stable Diffusion scripts 💥] Folks if you missed this one I open-sourced a script that should make it super easy to get started playing with stable diffusion!