Aleksa Gordić 🍿🤖
https://t.co/mcuQvV7YOC proud father of 16 A100s flirting with LLMs, tensor core maximalist micromortmaxxer x @GoogleDeepMind @Microsoft
Apr 16, 2023 7 tweets 3 min read
[🤖 This is BIG!] The best truly open-source ChatGPT alternative just came in! OpenAssistant! In a user study, they showed that OpenAssistant replies are on par with ChatGPT (48.3 vs 51.7%)! 🤯

Try it out: open-assistant.io/chat
@ykilcher's vid:

1/ 👇🧵
Additionally, I assume the model will suffer far less from "corporate speech" which will make it way more fun to use! Let me know how you like it once you try it down in the comment section. :))

2/
Apr 15, 2023 4 tweets 2 min read
Insightful new blog post by @chipro covers a broad set of LLM-related topics fairly concisely.

Great if you want to get broad exposure to the field: huyenchip.com/2023/04/11/llm…

The topics covered include:

* What to do with LLMs' output issues due to the inherent ambiguity

1/ 🧵👇
of natural language (e.g. your output format might be violated, your outputs could vary without you changing the input, etc.)

* Prompt versioning (similar to how you version code, data...) and optimization (CoT - chain of thought prompting, self-consistency technique, etc.)

2/
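The self-consistency technique mentioned above boils down to sampling several answers for the same prompt and keeping the most common one. Here is a minimal sketch of just the voting step, assuming the sampled answers are already available as strings. The function name `self_consistent_answer` and the normalization (strip + lowercase) are illustrative assumptions, not something taken from the blog post.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Majority vote over sampled final answers (hypothetical helper).

    `answers` is a list of strings, e.g. the final answers extracted from
    several chain-of-thought completions sampled for the same prompt.
    """
    # Normalize lightly so "42" and "42 " count as the same answer.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        raise ValueError("no non-empty answers to vote on")
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner


if __name__ == "__main__":
    sampled = ["42", "42 ", "41", "42", "40"]
    print(self_consistent_answer(sampled))
```

How the answers get sampled (temperature, number of samples, answer extraction) is left out on purpose, since that depends on the model and API you use.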
Apr 15, 2023 10 tweets 3 min read
[IMPORTANT ❗] I feel a sense of duty to warn people about social media posts they'll be seeing in AI in the upcoming months.

Web3/crypto/salesy vibes came to AI big time. Huge monetary incentives are at play so many of them flocked to the space...

1/ 🧵👇
(not judging, it's just a fact - and, as always, there are exceptions!).

I'm a techno-optimist but there is hyping and there is hyping so hard that you're basically lying and spreading misinformation.

A concrete example:

People saying AutoGPT, a project that started...

2/
Mar 14, 2023 16 tweets 5 min read
BIG LIFE ANNOUNCEMENT: I'm leaving @DeepMind to start my own company. I'm 28 now. This is the start of a new chapter in my life.

I'm both happy and sad.

❤️ I'm happy because I've been planning on starting my own company ever since I graduated from college back in 2017.

1/ (MEGA 🧵)
In the long run, I always felt that's going to be the best way to maximize my positive impact on the world.

My plan was to gain some real-world experience in the top tech companies before I go on to start my own thing.

That moment has finally come.

2/
Dec 28, 2022 25 tweets 10 min read
[🤖 Build time! 🧠] I'm so excited to announce my new project: Andrew Huberman podcast transcripts! 🎉🥳

hubermantranscripts.com

Quickly search for an episode, find highly accurate transcripts, click and be directed to the exact timestamp in the YouTube video!

@hubermanlab
1/
Find valuable information that @hubermanlab gave us for free. I know it had a tremendous positive impact on my life.

With this one, I'm opening up a series of projects that I'll be building over the next year!

Took me ~3 full days to transcribe all of the videos using my...

2/
Nov 24, 2022 9 tweets 3 min read
Enjoying Silicon Valley! :)

More photos around the @Google campus in the thread below! 👇

1/
Slav squatting with Androids - err, excuse me, *slavdroids

2/
Oct 1, 2022 6 tweets 4 min read
Watched the whole @Tesla AI day video:

Some takeaways:

[16:55 - 58:00] They introduced a prototype of their humanoid robot - Optimus. Only a concept last year - and now a reality. The progress was incredibly fast!

1/
Throughout the presentation, they stressed that there are so many parallels between building a humanoid robot and building a self-driving car. That's why the progress was so fast - they could reuse the supply chain, the training infra, etc.

2/
Sep 1, 2022 4 tweets 3 min read
If you want to understand how @StableDiffusion works behind the scenes I just made a deep dive video on it walking you through the codebase and papers step by step.

YT:

This is one of my most detailed deep dives so far

@robrombach @andi_blatt @pess_r
1/
If you want to understand how Stable Diffusion works behind the scenes I walk you through the codebase (github.com/CompVis/stable…) step by step explaining:

1. First stage autoencoder training (with KL regularization)

2. Latent Diffusion Model training (UNet + cond model)

2/
Aug 30, 2022 4 tweets 2 min read
[💥 Open-sourcing Stable Diffusion scripts 💥] Folks if you missed this one I open-sourced a script that should make it super easy to get started playing with stable diffusion!

The code is here: github.com/gordicaleksa/s…

1/
It supports generating a diverse set of images, interpolating in the latent space, and thus creating (mostly) smooth transitions in the image space!

The image you see above was generated using the prompt:

"a painting of an ai robot having an epiphany moment" 🤖🤖🤖

2/
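To make "interpolating in the latent space" a bit more concrete: one common choice for Gaussian latents is spherical linear interpolation (slerp) between two noise vectors. The sketch below is a pure-Python illustration on flat lists of floats. The thread doesn't say which interpolation scheme the script actually uses, so treat the `slerp` function and its signature as assumptions, not as the repo's code.

```python
import math


def slerp(t, v0, v1, eps=1e-7):
    """Spherical linear interpolation between two non-zero vectors.

    t=0 returns v0, t=1 returns v1. Falls back to plain linear
    interpolation when the vectors are (nearly) parallel.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    start, end = [1.0, 0.0], [0.0, 1.0]
    for step in range(5):
        print(slerp(step / 4, start, end))
```

In a real pipeline you would do this on the full latent tensors with an array library and decode each intermediate latent into an image. Those steps are omitted here.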
Aug 29, 2022 5 tweets 4 min read
[🤯 Stable Diffusion 💥] If you wanted to get started with Stable Diffusion this video is for you!

Includes a walk-through of my code inspired by @karpathy's gist: github.com/gordicaleksa/s…

YT:

Thanks @EMostaque and the team for making this possible.

1/
I show you 3 ways to get started with Stable Diffusion:

1. Using @huggingface Spaces (super slow, but super easy)

2. Using diffusers Colab notebooks (mid-ground). Thanks @psuraj28, @pcuenq for making these!

3. Running it locally (my code, most control/flexibility)

2/
Aug 18, 2022 9 tweets 4 min read
Took some time to read through the logs behind @BigscienceW's BLOOM and @MetaAI's OPT-175B model training.

It's amazing they shared these publicly.

LLM training is true alchemy and modern-day babysitting.

Some examples that cracked me up

from github.com/facebookresear…

1/ 👇🧵
This is what perplexity vs wall-clock time looks like when training LLMs. 😅

You can almost taste that suffering

2/
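For context on the metric in that plot: perplexity is the exponential of the average per-token negative log-likelihood (in nats), so a loss spike shows up as a perplexity spike. A minimal sketch, assuming you already have per-token losses as a list of floats. The `perplexity` helper is hypothetical and not taken from the BLOOM or OPT logs.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that is uniformly unsure among 4 tokens has NLL = ln(4) per token.
    print(perplexity([math.log(4)] * 10))
```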
Aug 3, 2022 5 tweets 2 min read
During 2020 I started logging my ML journey that eventually led to me landing a job at DeepMind - and I'm so happy I've done it!

For multiple reasons:

* I forced myself to distill everything I've learned, and that compression/reflection solidified my knowledge

1/ 👇🧶
* I feel I helped others going on a similar path (although the logs are fairly meta as well - it's a more general learning blueprint) as well as the future me!

* It's a nice historical document and a public artifact that I am proud of.

2/
Aug 1, 2022 6 tweets 2 min read
[learning machine learning 🧠] Don't fall into the same trap as many - namely trying to overengineer your curriculum when you're just getting started (and later as well).

You'll just end up with decision paralysis and eventually you'll give up...

1/ 👇🧶
- which is the only bad outcome (unless it comes from a place of deep self-awareness).

You ask yourself the following questions:

What's the best ML course out there? Should I do X, Y or Z? The reputable guy on Reddit said Y, @ylecun said Z, and my professor said W.

2/
Jul 31, 2022 12 tweets 3 min read
I get asked a lot about what it takes to land a job at DeepMind or any other world-class AI industry lab.

For those of you who are unaware, I wrote a detailed blog post on that topic and shared my personal journey here: link.medium.com/dV0H7fay6rb

If I could summarize..

1/ 🧶
..my tips, the main ones would be:

1) Have a lot of tenacity - it takes a lot of hard work, patience and consistency. The good thing is - this can be learned/practiced! For me personally I built this part of my personality through sports (calisthenics, running, martial arts..

2/
Jul 30, 2022 8 tweets 2 min read
I wish I had learned how to learn while I was still a kid. For some reason they don't teach us this in schools and everyone is left to figure it out on their own - which is sad, as most people never take the time to learn this.

A while ago I wrote a blog on this topic...

1/ 👇🧶
inspired by the "Learning How To Learn" @coursera course:
link.medium.com/0ZUtyq714rb

I strongly recommend you read it. Taking a step back from "actual learning" to boost your learning efficiency is time well spent.

IMHO, things that will benefit everyone should be...

2/
Jul 29, 2022 7 tweets 3 min read
If you truly want to become proficient with machine learning (I really don't like the word expert) try to get out of the "going through the newest courses and books" phase as soon as possible.

Too many people keep on reading the newest books that come out...

1/
(and same for courses), thinking they are now up to date with the ML world whereas in reality they are "light years" behind (things move fast around here 😇).

Try to get into the paper reading and replication phase as fast as you can without skipping the necessary steps.

2/
Jun 8, 2022 4 tweets 3 min read
Very much enjoyed doing this podcast with @LeiserNeil from AI Stories, discussing what makes one a good ML engineer, imposter syndrome, my career path and challenges I faced, my path to @DeepMind, how to learn ML, and much more!

Watch here:

1/ 🧵
It's my first podcast ever and it was long overdue! :))

Neil reached out already in January this year but since I had so much going on (moving to London, starting a new job + a bunch of personal things) I had to postpone it until now - but now it's up!

2/
Jun 8, 2022 10 tweets 3 min read
[🧠 Getting started with biology 🧠] I just finished the best MOOC I've done in my life: "Introduction to Biology - The Secret of Life" offered on edX by the famous @eric_lander (Human Genome Project guy).

I've collected a ton of notes over the last month or so.

1/ 🧵👇
My idea is to start sharing my learnings and notes over the next weeks - do let me know down in the comments whether you'd find that useful!

A bit about the course:

* You'll learn the fundamentals of biochemistry, genetics/genomics, and more - enough to understand...

2/
Jun 7, 2022 12 tweets 3 min read
[🧠 Interesting read 🧠] "Can a Biologist Fix a Radio? or, What I Learned while Studying Apoptosis" paper. So why is it interesting?

The state of AI seems to be somewhere in the middle between the experimental biologist's approach, as described in this paper, and...

1/
...the classical engineering approach.

The author observes and describes funny (and all too familiar?) boom and bust cycles that happen in biology while trying to understand complex phenomena with the promise of discovering a miracle drug that will solve all our problems.

2/
Apr 8, 2022 7 tweets 4 min read
Everyone is hyped up about @OpenAI's DALL-E 2 model atm and most people have noticed this cryptic "signature" code at the end of their images - but how many of you understand what it stands for?

I did some research and found the answer! 🔎🧠 I couldn't believe it.

Thread 👇🧵 1/
I've decided to do some deciphering. Is it a simple color code? What are @sama et al. trying to tell us?

I opened up my Photoshop and selected the color picker tool.

I first extracted 5 RGB tuples (R, G, B) from the 5 colors of the signature.

2/
Jan 9, 2022 7 tweets 3 min read
[🧠 Paper Summary 📚] An interesting paper was recently published on arXiv: "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" (although it originally appeared in May 2021).

The main idea is this:
1/ 🧵
If you have an overparametrized neural network (more params than the # of data points in your dataset) and you train it way past the point where it has memorized the training data (as suggested by the low training loss, and high val loss), all of a sudden the network will...

2/