I like building stuff. Cyber security for AI - VP of Engineering @FR0NTIER_X. Used to program self-driving cars at BMW.
Jul 5, 2022 • 20 tweets • 5 min read
Zero-Knowledge Proofs 0️⃣
How can I prove to you that I know a secret, without revealing any information about the secret itself?
This is called a zero-knowledge proof and it is a super interesting area of cryptography! But how does it work?
Thread 🧵
Let's start with an example
Peggy and Victor travel between cities A and B. There are two paths - a long path and a short path. The problem is that there is a gate on the short path for which you need a password.
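The protocol this sets up can be simulated. Here is a minimal sketch (function name, round count, and path labels are my own): each round, Victor demands that Peggy come out of a randomly chosen path; someone who knows the gate password can always comply, while a cheater only passes a round when her hidden entry choice happens to match the challenge.

```python
import random

def run_protocol(knows_password, rounds=20, seed=None):
    """Simulate the interactive proof. Each round, Peggy enters through a
    path Victor cannot observe; Victor then demands she exit via a random
    path. With the password she can always comply; a cheater passes a
    round only when her secret entry matches Victor's demand."""
    rng = random.Random(seed)
    for _ in range(rounds):
        entry = rng.choice(["short", "long"])   # Peggy's hidden choice
        demand = rng.choice(["short", "long"])  # Victor's random challenge
        if not knows_password and entry != demand:
            return False                        # caught cheating
    return True

# An honest Peggy always convinces Victor; a cheating Peggy survives
# 20 rounds with probability (1/2)^20, about one in a million.
```

Note that Victor learns nothing about the password itself, only that Peggy must know it.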
We will be dealing with an ML model to detect traffic lights for a self-driving car 🤖🚦
Traffic lights are small, so much more of the image will consist of parts that are not traffic lights.
Furthermore, yellow lights 🟡 are much rarer than green 🟢 or red 🔴.
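One common mitigation for this kind of imbalance is to weight each class inversely to its frequency in the loss. A minimal sketch (the label counts are invented for illustration):

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class inversely to its frequency so that rare classes
    (e.g. yellow lights) contribute as much to the loss as common ones."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

# Toy distribution: mostly background, yellow is the rarest class.
labels = ["background"] * 90 + ["green"] * 5 + ["red"] * 4 + ["yellow"] * 1
weights = inverse_frequency_weights(labels)
# The rare "yellow" class receives the largest weight.
```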
Mar 22, 2022 • 16 tweets • 7 min read
Machine Learning Explained 👨‍🏫
PCA
Principal Component Analysis is a commonly used method for dimensionality reduction.
It's a good example of how fairly complex math can have an intuitive explanation and be easy to use in practice.
Let's start with the applications of PCA 👇
Dimensionality Reduction
This is one of the common uses of PCA in machine learning.
Imagine you want to predict house prices. You get a large table of many houses and different features for them like size, number of rooms, location, age, etc.
Some features seem correlated 👇
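To make the correlated-features point concrete, here is a minimal PCA sketch in plain NumPy (the house numbers are synthetic, for illustration only): two nearly collinear features collapse onto a single principal component with almost no loss of variance.

```python
import numpy as np

def pca(X, k):
    """Project X (n samples × d features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # d × d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # sort components by variance
    components = eigvecs[:, order[:k]]
    return Xc @ components

# Two strongly correlated "house" features: size and number of rooms.
rng = np.random.default_rng(0)
size = rng.normal(100, 20, 200)
rooms = size / 25 + rng.normal(0, 0.1, 200)  # nearly a linear function of size
X = np.column_stack([size, rooms])
Z = pca(X, k=1)  # one component keeps almost all the variance
```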
Mar 18, 2022 • 8 tweets • 4 min read
Is this formula difficult? 🤔
This is the formula for Gradient Descent with Momentum as presented on Wikipedia.
It may look intimidating at first, but I promise you that by the end of this thread it will be easy to understand!
Let's break it down! The basis is this simple formula describing an iterative optimization method.
We have some weights (parameters) and we iteratively update them in some way to reach a goal.
Iterative methods are used when we cannot compute the solution directly.
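As a concrete instance of such an iterative method, here is a minimal momentum sketch (learning rate, momentum coefficient, and the toy objective are illustrative choices of mine):

```python
def momentum_descent(grad, w, lr=0.1, beta=0.9, steps=200):
    """Gradient descent with momentum:
       v ← β·v + ∇f(w)   (accumulate a velocity)
       w ← w − lr·v      (update the weights)"""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)
        w = w - lr * v
    return w

# Minimize f(w) = (w − 3)²; its gradient is 2·(w − 3).
w_star = momentum_descent(lambda w: 2 * (w - 3), w=0.0)
# w_star converges toward the minimum at w = 3.
```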
Mar 16, 2022 • 20 tweets • 4 min read
Machine Learning Formulas Explained 👨‍🏫
For regression problems you can use one of several loss functions:
▪️ MSE
▪️ MAE
▪️ Huber loss
But which one is best? When should you prefer one over the other?
Thread 🧵
Let's first quickly recap what each of the loss functions does. After that, we can compare them and see the differences based on some examples.
👇
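To preview the comparison, here is a small sketch (the error values are invented for illustration) showing how each loss reacts to a single severe outlier:

```python
import numpy as np

def mse(err):
    return np.mean(err ** 2)

def mae(err):
    return np.mean(np.abs(err))

def huber(err, delta=1.0):
    # quadratic near zero, linear beyond delta
    quad = 0.5 * err ** 2
    lin = delta * (np.abs(err) - 0.5 * delta)
    return np.mean(np.where(np.abs(err) <= delta, quad, lin))

errors = np.array([0.1, -0.2, 0.3, 10.0])  # one severe outlier
# MSE is dominated by the outlier (it is squared);
# MAE and Huber grow only linearly with it.
```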
Mar 11, 2022 • 16 tweets • 5 min read
Machine Learning in the Real World 🧠🤖
ML for real-world applications is much more than designing fancy networks and fine-tuning parameters.
In fact, you will spend most of your time curating a good dataset.
We need to represent the real world as accurately as possible. If some situations are underrepresented, we are introducing Sampling Bias.
Sampling Bias is nasty because we'll have high test accuracy, but our model will perform badly when deployed.
👇
Mar 8, 2022 • 13 tweets • 5 min read
Machine Learning Formulas Explained 👨‍🏫
This is the Huber loss - another complicated-looking formula...
Yet again, if you break it down and understand the individual parts, it becomes really easy.
Let me show you 👇
Background
The Huber loss is a loss function that is similar to the Mean Squared Error (MSE) but it is designed to be more robust to outliers.
MSE suffers from the problem that a small number of severe outliers can dominate the whole loss.
How does it work? 👇
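Here is a minimal scalar sketch of the piecewise definition (δ = 1 chosen for illustration): quadratic like MSE for small errors, linear like MAE beyond δ, so a severe outlier cannot dominate the total loss.

```python
def huber_loss(e, delta=1.0):
    """Piecewise Huber loss: 0.5·e² if |e| ≤ δ, else δ·(|e| − 0.5·δ)."""
    if abs(e) <= delta:
        return 0.5 * e * e                 # MSE-like near zero
    return delta * (abs(e) - 0.5 * delta)  # MAE-like for large errors

# Near zero it behaves like half the squared error…
small = huber_loss(0.5)    # 0.125 == 0.5 · 0.5²
# …while for large errors it grows only linearly.
big = huber_loss(100.0)    # 99.5 == 1 · (100 − 0.5)
```

The −0.5·δ term in the linear branch makes the two pieces join smoothly at |e| = δ.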
Mar 4, 2022 • 13 tweets • 5 min read
Machine Learning Formulas Explained! 👨‍🏫
This is the formula for the Binary Cross Entropy Loss. It is commonly used for binary classification problems.
It may look super confusing, but I promise you that it is actually quite simple!
Let's go step by step 👇
#RepostFriday
The Cross-Entropy Loss function is one of the most used losses for classification problems. It tells us how well a machine learning model classifies a dataset compared to the ground truth labels.
The Binary Cross-Entropy Loss is a special case when we have only 2 classes.
👇
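A minimal sketch of the formula in code (the example predictions are invented): for each sample, only one of the two log terms is active, depending on whether the true label is 1 or 0.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """BCE = −mean( y·log(p) + (1−y)·log(1−p) )"""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Confident correct predictions give a small loss;
# confident wrong ones are punished heavily.
good = binary_cross_entropy([1, 0], [0.9, 0.1])
bad = binary_cross_entropy([1, 0], [0.1, 0.9])
```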
Mar 3, 2022 • 11 tweets • 7 min read
When machine learning met crypto art... they fell in love ❤️
The Decentralized Autonomous Artist (DAA) is a concept that is uniquely enabled by these technologies.
Meet my favorite DAA - Botto.
Let me tell you how it works 👇
Botto uses a popular technique to create images - VQGAN+CLIP
In simple terms, it uses a neural network that generates images (VQGAN), guided by the powerful CLIP model, which can relate images to text.
This method can create stunning visuals from a simple text prompt!
👇
Feb 25, 2022 • 20 tweets • 7 min read
There are two problems with ROC curves
❌ They don't work for imbalanced datasets
❌ They don't work for object detection problems
So what do we do to evaluate our machine learning models properly in these cases?
We use a Precision-Recall curve.
Thread 👇
#RepostFriday
Last week I wrote another detailed thread on ROC curves. I recommend that you read it first if you don't know what they are.
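A minimal sketch of the two ingredients of that curve, with toy labels at a single decision threshold (sweeping the threshold traces the full Precision-Recall curve):

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many alarms were real
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many positives we found
    return precision, recall

# Heavily imbalanced data: only 2 positives among 100 samples.
y_true = [1, 1] + [0] * 98
y_pred = [1, 0] + [0] * 95 + [1] * 3
```

Unlike the ROC curve, neither metric uses the true negatives, which is exactly why the huge negative class cannot mask a bad detector.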
Is your machine learning model performing well? What about in 6 months? 🤔
If you are wondering why I'm asking this, you need to learn about concept drift and data drift.
Let me explain this to you using two real-world examples.
Thread 👇
Imagine you are developing a model for a self-driving car to detect other vehicles at night.
Well, this is not too difficult, since vehicles have two red tail lights and it is easy to get a lot of data. Your model works great!
But then... 👇
Feb 22, 2022 • 9 tweets • 5 min read
Math is not very important when you are using a machine learning method to solve your problem.
Everybody who disagrees should study the 92-page appendix of the Self-Normalizing Networks (SNN) paper before using torch.nn.SELU.
And the core idea of SNN is actually simple 👇
SNNs use an activation function called Scaled Exponential Linear Unit (SELU) that is pretty simple to define.
It has the advantage that the activations converge to zero mean and unit variance, which allows training of deeper networks and employing strong regularization.
👇
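A minimal sketch of the SELU activation with the fixed constants from the paper:

```python
import math

# Constants derived in the SNN paper so that activations converge
# to zero mean and unit variance through the layers.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """SELU(x) = scale·x for x > 0, scale·alpha·(exp(x) − 1) otherwise."""
    return SCALE * x if x > 0 else SCALE * ALPHA * (math.exp(x) - 1)
```

The slight scaling above 1 for positive inputs and the saturating negative branch are exactly what the 92-page appendix exists to justify.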
Feb 21, 2022 • 10 tweets • 3 min read
This is like an NFT in the physical world
This is a special edition BMW 8 Series painted by the famous artist Jeff Koons. A limited edition of 99, priced at $350K - about $200K more than the regular M850i.
If you think about it, you'll see many similarities with NFTs
👇
Artificially scarce
BMW can surely produce (mint 😀) more than 99 cars with this paint. The collection size is limited artificially to make it more exclusive.
Same as most NFT collections - they create artificial scarcity.
👇
Feb 18, 2022 • 19 tweets • 6 min read
Did you ever want to learn how to read ROC curves? 📈🤔
This is something you will encounter a lot when analyzing the performance of machine learning models.
ROC stands for Receiver Operating Characteristic but just forget about it. This is a military term from the 1940s and doesn't make much sense today.
Think about these curves as True Positive Rate vs. False Positive Rate plots.
Now, let's dive in 👇
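In that spirit, here is a minimal sketch computing one (TPR, FPR) point from toy scores; sweeping the threshold from high to low traces the whole ROC curve.

```python
def roc_point(y_true, scores, threshold):
    """One point on the ROC curve: TPR and FPR at a given threshold."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, scores) if t == 1 and s < threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < threshold)
    tpr = tp / (tp + fn)  # true positive rate: positives we caught
    fpr = fp / (fp + tn)  # false positive rate: negatives we falsely flagged
    return tpr, fpr

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]  # model's confidence per sample
tpr, fpr = roc_point(y_true, scores, threshold=0.5)
```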
Feb 17, 2022 • 12 tweets • 2 min read
It sucks if your ML model can't achieve good performance, but it is even worse if you don't know it!
Sometimes you follow all the best practices and your experiments show your model performing very well, but it fails when deployed.
A thread about Sampling Bias 👇
There is a lot of information about rules you need to follow when evaluating your machine learning model:
▪️ Balance your dataset
▪️ Use the right metric
▪️ Use high-quality labels
▪️ Split your training and test data
▪️ Perform cross-validation
But this may not be enough 👇
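As a concrete instance of the split and cross-validation items above, here is a minimal k-fold split sketch (stdlib only; the fold count is illustrative). Note that this kind of procedural hygiene still cannot catch sampling bias in how the data was collected in the first place.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split n sample indices into k disjoint folds for cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)    # shuffle before splitting
    return [idx[i::k] for i in range(k)]

folds = k_fold_indices(10, 5)
# Each fold serves once as the test set while the rest form the training set.
```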
Jan 18, 2022 • 15 tweets • 4 min read
The Internet is already decentralized, so why do we need web3? 🤔
This is a common critique of web3. However, decentralization on its own is not always enough - sometimes we need to agree on a set of facts.
Blockchains give us a consensus mechanism for that!
Thread 🧵
1/12
The Internet is built of servers that communicate using open protocols like HTTP, SMTP, WebRTC etc. Everybody can set up a server and participate. It is decentralized!
However, if two servers distribute contradicting information, how do you know which one is right?
2/12
Jan 18, 2022 • 9 tweets • 4 min read
How decentralized is web3 really?
While there is a lot of hype around web3, NFTs, and decentralized apps (dApps), there is also a lot of criticism. Today, I'll focus on the critique that web3 is actually too centralized.
Let's try to have an honest discussion 👇
These are the main arguments I see regularly. Please add more in the comments.
1️⃣ The Internet is already decentralized
2️⃣ It is inefficient
3️⃣ Everything can be implemented better using a centralized approach
4️⃣ Important services are centralized
👇
Jan 17, 2022 • 7 tweets • 5 min read
How many parameters do you need in your neural network to solve any problem? 🤔
GPT-3 has 175 billion, MT-NLG has 530 billion and Wu Dao has 1.75 trillion.
But the truth is you only need 1 parameter. No, not 1 billion. Just a single parameter!
Let me explain 👇
Yes, of course, I'm trolling you, but only a little bit 😉
I want to show you this very cool work by @ranlot75 on how to fit an arbitrary dataset with a single parameter and the following function
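The actual function is in the linked work. As a hypothetical stand-in for the underlying idea - a single real number has unbounded information capacity - here is a toy digit-packing sketch of my own, not the paper's sine-based construction:

```python
def encode(values, digits=4):
    """Pack several values from [0, 1) into one integer 'parameter' by
    concatenating the first `digits` decimal digits of each value."""
    packed = "".join(f"{int(round(v * 10**digits)):0{digits}d}" for v in values)
    return int(packed)

def decode(param, index, count, digits=4):
    """Recover the index-th value from the single packed parameter."""
    s = f"{param:0{count * digits}d}"
    return int(s[index * digits:(index + 1) * digits]) / 10**digits

# One "parameter" memorizes the whole (toy) dataset.
param = encode([0.1234, 0.9876, 0.5555])
```

The catch, in the toy version as in the paper, is precision: the single parameter needs as many digits as the entire dataset, so nothing is compressed or learned - which is exactly the point being made about parameter counting.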
You think you know what an NFT is? Well, think again...
You are doing it wrong if you think of NFTs as pixelated images of punks, toads, or apes. It is not about the JPEG!
Here is a better mental model for thinking about NFTs 👇
Forget the images for now. Owning an NFT means that your wallet address is listed as the owner of a specific digital asset on the blockchain.
Digital assets are organized in collections and an NFT is one specific piece of this collection.
Let's look at an example 👇
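The ownership model can be sketched as a tiny ledger. This is a hypothetical toy class with made-up wallet names; real NFTs live in a smart contract (e.g. ERC-721 style), but the mapping has the same shape: collection + token id → owner address.

```python
class Collection:
    """Toy model of an NFT collection: a ledger mapping token ids to owners."""
    def __init__(self, name, size):
        self.name = name
        self.size = size       # the collection has a fixed number of pieces
        self.owners = {}       # token_id → wallet address

    def mint(self, token_id, wallet):
        assert 0 <= token_id < self.size and token_id not in self.owners
        self.owners[token_id] = wallet

    def transfer(self, token_id, from_wallet, to_wallet):
        assert self.owners.get(token_id) == from_wallet  # only the owner may sell
        self.owners[token_id] = to_wallet

punks = Collection("SomePunks", size=10000)
punks.mint(42, "0xAlice")
punks.transfer(42, "0xAlice", "0xBob")
# Owning the NFT means being the address in this ledger entry -
# not possessing the image file itself.
```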
Dec 21, 2021 • 5 tweets • 3 min read
Things are getting more and more interesting for AI-generated images! 🎨
GLIDE is a new model by @OpenAI that can generate images guided by a text prompt. It is based on a diffusion model instead of the more widely used GAN models.
Some details 👇
GLIDE also has the interesting ability to perform inpainting, allowing for some creative uses.