[🤖 Build time! 🧠] I'm so excited to announce my new project: Andrew Huberman podcast transcripts! 🎉🥳

hubermantranscripts.com

Quickly search for an episode, find highly accurate transcripts, click and be directed to the exact timestamp in the YouTube video!
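For the curious: jumping to the exact moment is just a matter of building YouTube's `t=` URL parameter from a transcript segment's start time. A minimal sketch (not the actual app code; the video id below is a placeholder):

```python
def youtube_link(video_id: str, start_seconds: float) -> str:
    """Build a YouTube URL that opens the video at the given timestamp."""
    # YouTube accepts the offset in whole seconds via the t= parameter.
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start_seconds)}s"

# A transcript segment starting 65.3 s into a (placeholder) video:
print(youtube_link("VIDEO_ID", 65.3))  # → https://www.youtube.com/watch?v=VIDEO_ID&t=65s
```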

@hubermanlab
1/
Find the valuable information that @hubermanlab gave us for free. It had a tremendous positive impact on my life.

With this one, I'm opening up a series of projects that I'll be building over the next year!

Took me ~3 full days to transcribe all of the videos using my...

2/
...new @nvidia RTX 3090 machine ("digital dragon" - you can find the playlist of how I built it here: 😍).

It took me 7 DAYS to learn everything I needed to know about front-end (FE) development to build this app. :) (@nextjs + @tailwindcss)

3/
(but I did work like a maniac 😅)

Your feedback is very much needed!! ❤️ Let me know in the comments what is not working, what new features you'd want, etc.!

The story behind how this happened is in the comments! 👇 (long thread! medium blog coming by EOY)

4/
I was inspired by @karpathy's karpathy.ai/lexicap/ and so I decided to do a similar thing for my favorite podcast.

Also, I've been planning to start building stuff and learning about everything that is needed to build a fully-fledged web app - the perfect storm.

5/
I'm starting with front-end and working my way to back-end and MLOps.

Here is a nerdy summary of the process behind how the app was created.

Data portion of the app (Python):
1) Figured out how to use YouTube Data API (developers.google.com/youtube/v3/get…) to scrape the video ids..

6/
from a particular YouTube channel -> this took way too much time :) The main idea is to find the "uploaded videos" playlist and get the ids from there (you can't get them from the channel itself… not intuitive).
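A sketch of that idea in Python (illustrative, not the repo's actual code; the API key is something you'd create in the Google Cloud console). The handy trick: the uploads playlist id is just the channel id with its `UC` prefix swapped for `UU`:

```python
import json
import urllib.parse
import urllib.request

def uploads_playlist_id(channel_id: str) -> str:
    """The 'uploads' playlist id is the channel id with 'UC' swapped for 'UU'.
    (You can also fetch it via channels.list with part=contentDetails.)"""
    assert channel_id.startswith("UC"), "expected a UC... channel id"
    return "UU" + channel_id[2:]

def fetch_video_ids(channel_id: str, api_key: str) -> list[str]:
    """Page through playlistItems.list to collect every video id on the channel."""
    ids, page_token = [], ""
    while True:
        params = {
            "part": "contentDetails",
            "playlistId": uploads_playlist_id(channel_id),
            "maxResults": 50,  # API maximum per page
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        url = ("https://www.googleapis.com/youtube/v3/playlistItems?"
               + urllib.parse.urlencode(params))
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        ids += [item["contentDetails"]["videoId"] for item in data["items"]]
        page_token = data.get("nextPageToken", "")
        if not page_token:
            return ids
```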

2) Used yt-dlp (github.com/yt-dlp/yt-dlp) to download the audio..

7/
..files, thumbnails, and video metadata -> amazing tool, but it took some time to figure out how to download exactly what I need. Documentation is very verbose. 😅
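In case it saves someone the digging, the relevant yt-dlp options look roughly like this (a sketch under my assumptions, not the exact config the app used; the audio postprocessor needs ffmpeg on your PATH):

```python
def build_ydl_opts(out_dir: str) -> dict:
    """yt-dlp options to grab audio, thumbnail, and metadata in one pass."""
    return {
        "format": "bestaudio/best",              # best audio-only stream available
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",  # name output files by video id
        "writethumbnail": True,                  # save the thumbnail alongside
        "writeinfojson": True,                   # save metadata as <id>.info.json
        "postprocessors": [
            # Re-encode the audio to mp3 (requires ffmpeg to be installed).
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }

def download_all(video_ids, out_dir="data"):
    import yt_dlp  # pip install yt-dlp
    urls = [f"https://www.youtube.com/watch?v={vid}" for vid in video_ids]
    with yt_dlp.YoutubeDL(build_ydl_opts(out_dir)) as ydl:
        ydl.download(urls)
```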

3) Used @OpenAI Whisper large v2 (github.com/openai/whisper) to transcribe the audio (took 3 days on my RTX 3090)

8/
I was tracking/logging the temperature using HWiNFO64 - this was the first time I'd seriously used my GPU since I built the digital dragon, so I wanted to be sure I wouldn't set my house on fire. 😂

This was fairly easy as I was very familiar with the Whisper codebase...

9/
(made a video on it: ). First night: a Windows update killed my overnight run… :) Beautiful.
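The Whisper part itself is only a few lines (a sketch; `large-v2` is a multi-GB download and really wants a GPU). The timestamp formatter is my own illustrative helper, not part of Whisper:

```python
def format_segment(start_seconds: float, text: str) -> str:
    """Render one Whisper segment as a timestamped transcript line."""
    h, rem = divmod(int(start_seconds), 3600)
    m, s = divmod(rem, 60)
    return f"[{h:02d}:{m:02d}:{s:02d}] {text.strip()}"

def transcribe(audio_path: str) -> list[str]:
    import whisper  # pip install openai-whisper
    model = whisper.load_model("large-v2")  # downloads weights on first use
    result = model.transcribe(audio_path)   # {"text": ..., "segments": [...]}
    # Each segment carries "start"/"end" times in seconds plus its "text".
    return [format_segment(seg["start"], seg["text"]) for seg in result["segments"]]
```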

4) Finally, I preprocessed the transcripts to remove the headers, cleaned up the metadata, calculated the estimated reading time (using the same heuristic as Medium)...

10/
..and resized my thumbnails (trying to linearize the story but this actually happened during the front-end app optimization step).
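The reading-time heuristic is simple: Medium is commonly reported to assume ~265 words per minute for adults (plus small per-image adjustments, ignored in this sketch):

```python
import math

WORDS_PER_MINUTE = 265  # the figure commonly attributed to Medium

def reading_time_minutes(text: str) -> int:
    """Estimated reading time, rounded up to the nearest whole minute."""
    word_count = len(text.split())
    return max(1, math.ceil(word_count / WORDS_PER_MINUTE))
```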

FE portion (@nextjs, @tailwindcss):
1) Did some mini-research on the FE tech to use: ended up w/ @reactjs i.e. Next.js (but along the way...

11/
considered Gatsby, Flask, Django, Angular, etc.). A lot of opinions on the internetz on what to use. Also learned a bit about Go use cases and other BE tech, plus JS concepts like async/await, serverless, etc.

2) Binge-watched videos on @reactjs and Next.js.

12/
(one React tutorial I watched was 12 hours long!). This took 2 days (all while working at DeepMind full-time). Strategy? 2x speed, the skip-5-seconds button, etc. Read the official Next.js tutorial (super useful!).

3) Wanted to minimize the amount of time spent on prep

13/
(stuff above) so I started coding immediately (I learn the best by doing). Found multiple Next.js templates I liked, got the projects working locally, analyzed how UI components I care about work, and merged stuff.

4) Problems in paradise -> had to learn to use Chrome dev..

14/
..tools; console.log wouldn't cut it anymore. Watching tutorials, I learned a ton about browsers in general, event loops, and all of the dev tools tabs (Network, Lighthouse, etc.).

Debugging server side code was a challenge -> chrome://inspect/#devices to the rescue...

15/
(vscode was not working properly so I ended up using chrome dev tools). Also, debugging JS in vscode is not as intuitive as it is in Python…

Setting up the environment with npm/yarn was also not without its problems. :)

5) Learning tailwind (CSS framework) ad hoc (no tutorials, just googling

16/
+ @OpenAI ChatGPT! It's good, my God!) to design the navbar, email form, etc. Completely refactored the codebase, made it minimal, understood every single detail. Literally. I'm obsessed.

6) Hitting optimization issues. I realized I was loading all of the transcripts and not

17/
even using them! (fixing this boosted my Lighthouse perf score from ~20 to ~90!). Additionally: resized the thumbnails and pruned the metadata.

7) Testing, testing! Tested the app with light/dark default OS themes and on different devices (Chrome dev tools again), and fixed all console errors..

18/
8) Social image preview problems w/ the @Twitter card -> turned out I needed a robots.txt, otherwise the Twitter bot couldn't pull the image to create the card!
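For reference, even a fully permissive robots.txt is enough for crawlers like Twitterbot to fetch the preview image (the exact file the app serves isn't shown here; this is just the minimal form):

```
User-agent: *
Allow: /
```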

9) Learning MailChimp API to add the backend functionality for subscribing to my list (you should do it! ❤️

19/
-> see the email box in the app). It just wasn't working, so I downloaded Postman to debug it. Sneaky bug: I had manually and permanently removed my email from my audience (Mailchimp lingo) - because of that, I couldn't re-add myself via the API!

Trying to make the confirmation email nicer

20/
..it's ugly. If anyone has a recommendation on what to use for this purpose (create an email list) let me know!
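For anyone fighting the same thing: adding a member boils down to one authenticated POST against Mailchimp's Marketing API. A stdlib-only sketch (`api_key` and `list_id` are yours to supply). The gotcha I hit applies here too: a permanently deleted contact can't be re-added via the API at all - they have to resubscribe through a signup form.

```python
import base64
import hashlib
import json
import urllib.request

def subscriber_hash(email: str) -> str:
    """Mailchimp identifies list members by the MD5 of the lowercased email."""
    return hashlib.md5(email.lower().encode("utf-8")).hexdigest()

def add_member(api_key: str, list_id: str, email: str) -> dict:
    # The data-center suffix of the API key (e.g. "us21") picks the API host.
    dc = api_key.rsplit("-", 1)[-1]
    url = f"https://{dc}.api.mailchimp.com/3.0/lists/{list_id}/members"
    body = json.dumps({"email_address": email, "status": "subscribed"}).encode()
    req = urllib.request.Request(url, data=body, method="POST")
    # HTTP Basic auth: any username, API key as the password.
    token = base64.b64encode(f"anystring:{api_key}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```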

10) Setting up analytics: Google (GA) + @vercel.

11) Bought the domain you saw above, hosting on hobby plan on Vercel, setting env vars for Mailchimp and GA.

21/
12) Beta testing with a small selected group of people (started last night) 😂 One piece of feedback I got: you're faster than my fastest FE developer, lol! Similar feedback from a friend who's an actual FE engineer! And I hadn't done any FE until 7 days ago. 😅

22/
It's easy to pick up things once you have the fundamentals in place.

Additional remark: found the @github Copilot + @OpenAI ChatGPT + vscode trio super useful during my dev process. I've never felt this productive before.

Looking into the future, I have many ideas on how to improve the app:

23/
* Add episode summaries using LLMs - transcripts are very long
* Group transcripts into chapters for easier navigation
* Add navbar and search on the transcripts page
* Add fuzzy or even better neural search so that you can find even more abstract thoughts
* Maybe add Q&A

24/
* Automate the process from the moment @hubermanlab uploads a video to the moment the transcript is available on my app. Currently doing this semi-manually, with small overhead.
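The automation can be as simple as a polling loop that diffs the channel's video ids against what's already transcribed. A sketch - `fetch_ids` and `process` are hypothetical hooks standing in for the pipeline steps above:

```python
import time

def new_video_ids(latest_ids, already_done):
    """Ids present upstream but not yet transcribed (order preserved)."""
    done = set(already_done)
    return [vid for vid in latest_ids if vid not in done]

def poll_forever(fetch_ids, process, done, interval_seconds=3600):
    """fetch_ids() lists the channel's video ids; process(vid) would
    download, transcribe, and deploy one video. `done` is the persisted
    set of already-handled ids. Both hooks are hypothetical stand-ins."""
    while True:
        for vid in new_video_ids(fetch_ids(), done):
            process(vid)
            done.add(vid)
        time.sleep(interval_seconds)
```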

Any feedback is welcome!!! ❤️❤️❤️

End of MEGA thread.

25/25
