Radek Osmulski 🇺🇦
LLMs and retrieval by day, training robots on the weekend 🧪 Senior Data Scientist @NVIDIAAI 🏫 @fastdotai trained DL Eng https://t.co/By87iXx5Pu
Apr 11, 2023 • 9 tweets • 4 min read
How does LangChain actually work?

We see the wonderful things it can do, but what does it send to the model?

What does the model send back?

How does it all work?

I decided to investigate 🕵️‍♂️

Here is how LangChain allows LLMs to perform Google searches:

Here is the template of the prompt.

But what does the model send back?
How does the scratchpad work?
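
To make the scratchpad idea concrete, here is a minimal sketch of an agent loop. It is NOT LangChain's actual template or code. The prompt text, `fake_llm`, `fake_search` and `run_agent` are all hypothetical stand-ins. The only point is to show how each model reply plus the tool's observation gets appended to a scratchpad that is sent back with the next prompt.

```python
# Hypothetical prompt template; the real one differs.
PROMPT_TEMPLATE = """Answer the question using the Search tool when needed.

Question: {question}
{scratchpad}"""


def fake_llm(prompt):
    """Stand-in for a real model call: search once, then answer."""
    if "Observation:" not in prompt:
        return (
            "Thought: I should look this up.\n"
            "Action: Search\n"
            "Action Input: capital of France"
        )
    return "Thought: I now know the answer.\nFinal Answer: Paris"


def fake_search(query):
    """Stand-in for a real search tool."""
    return "Paris is the capital of France."


def run_agent(question, max_steps=3):
    scratchpad = ""
    for _ in range(max_steps):
        prompt = PROMPT_TEMPLATE.format(question=question, scratchpad=scratchpad)
        reply = fake_llm(prompt)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip(), scratchpad
        # Run the requested tool and record what came back.
        query = reply.split("Action Input:")[1].strip()
        observation = fake_search(query)
        # The scratchpad grows with every step and is re-sent to the model.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    return None, scratchpad


answer, scratchpad = run_agent("What is the capital of France?")
```

In this toy setup the model never "remembers" anything between calls. Everything it knows about earlier steps comes from the scratchpad text in the prompt.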
Apr 3, 2023 • 9 tweets • 4 min read
💡 How do you use Deep Learning to prevent retail theft?

A fascinating GTC talk on a very valuable (and practical) application of DL!

Here are my notes:

How big is the problem?

Each year stuff worth $1.2 billion is stolen!

Preventing retail theft is a billion+ industry!
Mar 27, 2023 • 4 tweets • 2 min read
One of my favorite applications of GPT-4:

💡 asking it to rewrite what I wrote for clarity

1. I find a paragraph I don't feel is well written.
2. I ask GPT-4 to rewrite it:

I don't lose agency in the process -- I am the one doing the writing.

Sometimes I edit what I get back (GPT-4 tends to be quite wordy).

And still, for most of my writing, I leave it as it is, if I like how it flows:
Mar 24, 2023 • 10 tweets • 4 min read
This talk between Jensen Huang and @ilyasut from two days ago is a masterclass on what makes ChatGPT tick!

And an awesome first-hand account of Deep Learning history.

@ilyasut shared 2 core ideas behind @OpenAI:

1. Unsupervised learning through compression 💡

In 2016 no one knew how to do unsupervised learning.

Turns out, if you want to compress a lot of data, you have to discover the secrets that live in it.

That is the key. This is why LLMs work so well.
Mar 23, 2023 • 6 tweets • 2 min read
I switched to vscode for running Jupyter Notebooks so that I can use @github Copilot and wow 🥰

(that is coming from a die-hard vim/tmux/Jupyter Notebook fan)

Here is my experience:

You can easily ssh from vscode into a remote machine and run Jupyter notebooks inside vscode.

This is what a notebook inside vscode looks like:
Mar 23, 2023 • 6 tweets • 4 min read
💡 How to use ChatGPT in Jupyter Notebook

I created a cell magic that will allow you to talk to the OpenAI API directly from your notebook:

Why am I doing this?

I want to integrate AI tools into my workflow. They are revolutionary.

And the best way to do so is via reducing FRICTION (having them available where I work).

Here are a couple more observations from this project and how to install it:
Mar 22, 2023 • 15 tweets • 8 min read
I love peering into the future and watching the GTC keynote is my favorite way to do so!

Here are my highlights from the keynote from a few hours ago.

Lots of amazing developments so buckle up and let's go for a ride! 🙂

We begin with an interesting look at history.

AlexNet, the ImageNet winning model from 2012, had 61M parameters.

Trained on NVIDIA GPUs.

That was just 11 years ago 😱
Mar 17, 2023 • 9 tweets • 4 min read
Programming has changed forever 😳

Today I used GPT-4 for the first time and I can't imagine coding without it ever again.

Here is the project I worked on:

Each day I spend 30-45 minutes reading about the defense Ukrainian heroes mount against the onslaught of the Russian terrorist state.

What if I could have an @OpenAI model read the tweets for me and summarize what happened?

Here is how I did this thanks to GPT-4:
Mar 3, 2023 • 10 tweets • 2 min read
An introduction to Recommender Systems

>> a thread 🧵 <<

What do Recommender Systems do?

✅ they aim to predict user preferences
✅ and make personalized recommendations

Here are the different techniques they use:
Dec 3, 2022 • 9 tweets • 7 min read
Working on a @fastdotai starter pack for the @kaggle RSNA competition...

AKA journey to the top of the LB! 🚂🚃🚃

My favorite aspect of the experience so far...

"Look at all the code I DON'T HAVE to write!" 😄

Find what I have here, more soon: kaggle.com/code/radek1/fa…

No need to fumble around with writing the boilerplate for a training loop for the 1000th time in my life...

I get to even define a (very) custom metric and have it aggregated the way I want!

This is fun 🙂

And also makes the 🛤️ to the top of the LB... an express one 😄
Nov 17, 2022 • 9 tweets • 3 min read
Merlin Dataloader is 119x faster than my own PyTorch Dataset + Dataloader combo!

This is revolutionary for tabular data 🥳

Let's take a closer look at what is going on.

First, a disclaimer.

It is very hard to do benchmarking in a fair way.

I am comparing how *I would* do things in pure Python/PyTorch vs what Merlin Dataloader does for me.

Here is the setup:
Nov 9, 2022 • 6 tweets • 3 min read
3 ways to speed up your Python/pandas code by up to 10x that I learned from a recent @kaggle notebook:

zip > itertuples

Itertuples is the fastest built-in method to iterate over a pandas DataFrame.

Using zip gives you an additional speed up.
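
Here is a small sketch of the two iteration styles side by side. The DataFrame and its column names (`price`, `quantity`) are made up for illustration, and no timings are shown since those depend on your data and machine.

```python
import pandas as pd

# Toy DataFrame; the column names are hypothetical.
df = pd.DataFrame({"price": [1.0, 2.5, 4.0], "quantity": [3, 2, 1]})

# Option 1: itertuples yields one namedtuple per row.
total_itertuples = 0.0
for row in df.itertuples(index=False):
    total_itertuples += row.price * row.quantity

# Option 2: zip over the columns directly, which skips building
# a namedtuple for every row.
total_zip = 0.0
for price, quantity in zip(df["price"], df["quantity"]):
    total_zip += price * quantity
```

Both loops compute the same total. The zip version only touches the columns you actually need.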
Oct 29, 2022 • 8 tweets • 4 min read
Let me see if I can take notes on Twitter while working through the @fastdotai notebooks...

If it won't work, I'll just go take notes somewhere else, but let's give this a shot...

So what is CLIP? It is a model that

• takes in a piece of text and an image
• embeds both
• compares both embeddings

It outputs a score that is higher the more these two embeddings are alike.
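
A tiny sketch of just the comparison step, assuming the score is a cosine similarity between the two embeddings. The 3-dimensional vectors below are made up and stand in for the outputs of the text and image encoders (real embeddings are much longer).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, not produced by an actual model.
text_embedding = [0.9, 0.1, 0.0]
matching_image = [0.8, 0.2, 0.1]
unrelated_image = [0.0, 0.1, 0.9]

score_match = cosine_similarity(text_embedding, matching_image)
score_unrelated = cosine_similarity(text_embedding, unrelated_image)
```

The image whose embedding points in nearly the same direction as the text embedding gets the higher score.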
Oct 29, 2022 • 4 tweets • 2 min read
Courtesy of the weekend I am now caught up with @fastdotai lectures 🥰

The course keeps getting better from iteration to iteration, this is unbelievable! 🙌

Now the plan is to construct a @fastdotai sandwich.

This is a representation of what I feel constitutes the homework 🙂 ... or what counts to me as doing a @fastdotai course.

Often preparing a tasty sandwich takes many months 🙂
Sep 26, 2022 • 9 tweets • 3 min read
3 surprisingly effective techniques for training Computer Vision models I used to win a @kaggle competition

Here is how you can apply them in your projects:

1. Use the one-cycle policy to control overfitting.

The one-cycle policy will allow you to perfectly control overfitting.

It gave me an edge in the competition as I trained on a small subset of clean data.

@fastdotai makes using the one-cycle policy super easy.
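
For intuition, here is a plain-Python sketch of the learning-rate part of a one-cycle schedule: warm up to a peak, then anneal far below the starting value. The function and parameter names (`one_cycle_lr`, `pct_warmup`, `start_div`, `final_div`) and the default values are my own assumptions for illustration, not the fastai implementation, and the momentum schedule is left out.

```python
import math


def cosine_interpolate(start, end, progress):
    """Move from start to end along half a cosine wave; progress is in [0, 1]."""
    return end + (start - end) * (1 + math.cos(math.pi * progress)) / 2


def one_cycle_lr(step, total_steps, lr_max=1e-3, pct_warmup=0.25,
                 start_div=25.0, final_div=1e5):
    """Learning rate for a given step of a single warm-up/anneal cycle."""
    warmup_steps = int(total_steps * pct_warmup)
    if step < warmup_steps:
        # Phase 1: rise from lr_max / start_div up to lr_max.
        return cosine_interpolate(lr_max / start_div, lr_max, step / warmup_steps)
    # Phase 2: fall from lr_max down to lr_max / final_div.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return cosine_interpolate(lr_max, lr_max / final_div, progress)


schedule = [one_cycle_lr(step, total_steps=100) for step in range(100)]
```

With fastai you don't write any of this yourself, `Learner.fit_one_cycle` takes care of the schedule for you.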
Sep 25, 2022 • 7 tweets • 3 min read
I am launching a new blog -- TabularMusings 🥳

Here is the first blog post: tabularmusings.com/posts/feature-…

And here is the technology I am using and the reasons for starting the blog:

First, the reasons.

A good friend of mine suggested my tabular data threads are helpful.

But it is a shame they disappear off the planet Earth so quickly -- having them in blog form will make them less ephemeral.

But there is more 🙂
Sep 23, 2022 • 11 tweets • 3 min read
The most important Machine Learning skill:

👉 How to create a good validation set. 👈

But most people make a couple of basic mistakes.

Read this, and I guarantee you won't be one of them:

>> a thread 🧵 <<

First of all, what makes a good validation set?

A validation set has one goal only -- to mimic what your model will encounter when making predictions.

That's it.

Here are the 3 most common mistakes people make:
Sep 21, 2022 • 15 tweets • 11 min read
I love peering into the future and watching the GTC keynote is my favorite way to do so!

Here are my highlights from the keynote from a few hours ago.

Lots of amazing developments so buckle up and let's go for a ride! 🙂

1 of 15

"Future games will not have pre-baked worlds. Future games will be simulations."

Lighting is raytraced. Objects are constructed from small components. It's physics all the way down.

The visuals are stunning and the game runs on a single GPU!

It all looks... real 😳

2 of 15
Sep 11, 2022 • 5 tweets • 3 min read
💡 How to boost your productivity 2-5x by better structuring your pandas/cudf code

Here are 3 free resources that will set you up for success in your next ML project:

1. The main ingredient to writing readable, reusable code

The assign method

Take your understanding of it to the next level in this superb blog post by @__mharrison__

ponder.io/professional-p…
Sep 8, 2022 • 7 tweets • 4 min read
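
A quick sketch of what chaining with `assign` can look like in pandas. The DataFrame and column names (`price`, `quantity`, `revenue`, `revenue_share`) are made up for illustration.

```python
import pandas as pd

# Toy data; the column names are hypothetical.
raw = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Each assign returns a NEW DataFrame, so the steps chain cleanly
# and the original `raw` is left untouched.
features = (
    raw
    .assign(revenue=lambda df: df["price"] * df["quantity"])
    # The lambda receives the intermediate frame, so it can use
    # the `revenue` column created in the previous step.
    .assign(revenue_share=lambda df: df["revenue"] / df["revenue"].sum())
)
```

Because nothing is modified in place, you can re-run the chain from `raw` as often as you like and get the same result.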
How to train a Merlin Two Tower model in a single notebook

1. Create data
2. Process data
3. Train the model
4. Perform offline batch prediction (top k recommendation)

Two Tower is a vital architecture.

Here is everything you need to train it on your data today:

1/6 Create Data

We create our own toy dataset.

This will make it easier to understand exactly what our model is doing.
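
To show what the last step of the pipeline (top k recommendation) boils down to, here is a plain-Python sketch. It does not use Merlin. The embeddings are made up and stand in for the outputs of a trained user tower and item tower, and scoring with a dot product is an assumption on my part.

```python
import heapq

# Hypothetical embeddings standing in for the outputs of the two towers.
user_embeddings = {"user_a": [1.0, 0.0], "user_b": [0.0, 1.0]}
item_embeddings = {
    "item_1": [0.9, 0.1],
    "item_2": [0.2, 0.8],
    "item_3": [0.6, 0.5],
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_items(user_vector, items, k):
    """Score every item against the user vector and keep the k best ids."""
    scores = {item_id: dot(user_vector, vec) for item_id, vec in items.items()}
    return heapq.nlargest(k, scores, key=scores.get)


# Offline batch prediction: one top-k list per user.
recommendations = {
    user_id: top_k_items(vec, item_embeddings, k=2)
    for user_id, vec in user_embeddings.items()
}
```

The user and item vectors are computed independently, so the item side can be precomputed once and reused for every user.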
Sep 5, 2022 • 4 tweets • 2 min read
An important and well-researched post by @math_rachel that just surfaced in my email.

👉 I now apply myself to work less, precisely to increase my output

But there is one other aspect that most companies seem to miss:

medium.com/@racheltho/tec…

Everyone complains that talent is hard to retain.
Stories of low engagement at tech companies make the rounds on the Internet.

And yet you wouldn't believe how dedicated a person becomes to a workplace that treats them simply as a human would treat another human being.