ML TLDR

@MLsummaries

Apr 6, 2021 • 5 tweets • 3 min read

The right loss function depends on the problem we are trying to solve. Today we are going to learn about triplet loss. You may have come across it while reading about Siamese networks.

#MachineLearning #DeepLearning #RepresentationLearning

Triplet loss is an important loss function for learning a good “representation”. What’s a representation, you ask? Finding similarity (or difference) between two images is hard if you just compare raw pixels.

So what do we do about it? Given three images, cat1, cat2, and dog, we use a neural network f to map the images to vectors f(cat1), f(cat2), and f(dog).

As the name suggests, triplet loss takes three inputs. It tries to minimize the distance between the two cat vectors, f(cat1) and f(cat2), and to maximize the distance between the cat and dog vectors, f(cat1) and f(dog). That’s it!
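Here is a minimal sketch of that idea in plain Python. The thread does not give a formula, so this uses the common margin-based form, max(d(anchor, positive) - d(anchor, negative) + margin, 0), with Euclidean distance. The `margin` parameter, the function names, and the toy 2-D vectors are all assumptions for illustration; in practice the vectors would come from the network f.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss (margin is an assumption, not from the thread).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)  # cat1 vs cat2: want this small
    d_neg = euclidean(anchor, negative)  # cat1 vs dog: want this large
    return max(d_pos - d_neg + margin, 0.0)

# Toy stand-ins for f(cat1), f(cat2), f(dog); hypothetical values.
f_cat1 = [0.0, 0.0]
f_cat2 = [3.0, 4.0]  # distance 5 from f_cat1
f_dog = [0.0, 2.0]   # distance 2 from f_cat1

# The dog is closer to cat1 than cat2 is, so the loss is positive.
print(triplet_loss(f_cat1, f_cat2, f_dog))  # 5 - 2 + 1 = 4.0
```

Training pushes this value toward zero, which pulls the two cats together and pushes the dog away.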

See this image from Bromley, Guyon and @ylecun’s 1994 paper, “Signature Verification using a Siamese Time Delay Neural Network”. Do you notice the parallels?

More from @MLsummaries

Jun 7, 2021

There are way too many papers on #MachineLearning and #DeepLearning these days. How do you choose which papers to read? A tiny thread 🧵

The first one is our absolute favorite: Arxiv Sanity, by none other than @karpathy!

Link: arxiv-sanity.com

The second one is by @labmlai. It is pretty new and the interface is pretty smooth!

Link: papers.labml.ai/papers/recent/

Read 6 tweets

May 31, 2021

"Attention is all you need" is one of the most cited papers of the last couple of years. What is attention? Let's try to understand it in this thread.

Paper link: arxiv.org/abs/1706.03762

#DeepLearning #MachineLearning #Transformers

In the self-attention mechanism, we update the features of a given point with respect to the features of the other points. The attention proposed in this paper is also known as scaled dot-product attention.

Let's say our data point is a single sentence. We embed each word into some d-dimensional space, then compute how similar each word is to every other word, and weight its representation accordingly. The similarity matrix is just a scaled dot product!
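A rough sketch of that computation in plain Python. To keep it short, it assumes the queries, keys, and values are all the raw word embeddings themselves; the paper applies learned linear projections first, which are left out here. The dot products are scaled by 1/sqrt(d) and passed through a row-wise softmax, as in the paper. The function names and toy embeddings are made up for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X (an assumption).

    X is a list of n word embeddings, each a list of d floats.
    Returns the n x n attention weights and the n updated embeddings.
    """
    d = len(X[0])
    # Similarity matrix: dot product of every pair of words, scaled by sqrt(d).
    scores = [
        [sum(a * b for a, b in zip(xi, xj)) / math.sqrt(d) for xj in X]
        for xi in X
    ]
    weights = [softmax(row) for row in scores]
    # Each word's new representation is a weighted average of all words.
    out = [
        [sum(w * xj[k] for w, xj in zip(w_row, X)) for k in range(d)]
        for w_row in weights
    ]
    return weights, out

# Two toy "word" embeddings with d = 2; hypothetical values.
weights, out = self_attention([[1.0, 0.0], [0.0, 1.0]])
print(weights[0])  # the first word attends more to itself than to the second
```

Each row of `weights` sums to 1, so every updated embedding is a convex combination of the input embeddings.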

Read 7 tweets

Apr 7, 2021

https://twitter.com/normsu/status/1377253149627580418?s=20

This paper shares 56 stories of researchers in Computer Vision, young and old, scientists and engineers. Reading it was a cocktail of emotions as you simultaneously relate to the stories of joy, excitement, cynicism, and fear. Give it a read!

#ComputerVision

Some quotes from the stories - it was a "tough and hopeless time" in computer vision "before 2012, [when] the annual performance improvements over ImageNet are quite marginal."

"she told me you should solve the problem purely based on deep learning... I did not think the occlusion problem can be solved without explicitly reasoning of shape priors and depth ordering"

Read 8 tweets

Apr 7, 2021

Today we will summarize Vision Transformer (ViT) from Google. Inspired by BERT, they have implemented the same architecture for image classification tasks.

Link: arxiv.org/abs/2010.11929
Code: github.com/google-researc…

#MachineLearning #DeepLearning

The authors have taken the BERT architecture and applied it to images with minimal changes. Since the compute increases with the length of the sequence, instead of treating each pixel as a word, they propose splitting the image into N patches and treating each patch as a token.

So first take each patch, flatten it (giving a vector of length P²C for a P×P patch with C channels), and project it linearly to dimension D. In the 0th position, add a D-dimensional embedding that is learned. Then add positional encodings to these embeddings.
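The steps above can be sketched in plain Python as follows. This is only an illustration of the shapes involved, not the code from the linked repository: the image is a nested list, the projection matrix `E`, the class token, and the positional embeddings are filled with random or zero placeholder values instead of learned ones, and the helper names `extract_patches` and `embed` are hypothetical. It also assumes P divides the image height and width.

```python
import random

def extract_patches(image, P):
    """Split an H x W x C image (nested lists) into flattened P x P patches.

    Returns N = (H / P) * (W / P) vectors, each of length P * P * C.
    """
    H, W = len(image), len(image[0])
    assert H % P == 0 and W % P == 0, "sketch assumes P divides H and W"
    patches = []
    for top in range(0, H, P):
        for left in range(0, W, P):
            patches.append([
                value
                for r in range(top, top + P)
                for c in range(left, left + P)
                for value in image[r][c]
            ])
    return patches

def embed(patches, E, cls_token, pos):
    """Project patches to dimension D, prepend the class token, add positions."""
    D = len(cls_token)
    projected = [
        [sum(x * E[i][j] for i, x in enumerate(patch)) for j in range(D)]
        for patch in patches
    ]
    tokens = [cls_token] + projected  # class token sits in the 0th position
    return [[t + p for t, p in zip(tok, pe)] for tok, pe in zip(tokens, pos)]

# Toy sizes (hypothetical): 4 x 4 RGB image, 2 x 2 patches, D = 5.
rng = random.Random(0)
H, W, C, P, D = 4, 4, 3, 2, 5
image = [[[rng.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]

patches = extract_patches(image, P)
N = len(patches)

# Placeholders for parameters that would be learned during training.
E = [[rng.gauss(0, 0.02) for _ in range(D)] for _ in range(P * P * C)]
cls_token = [0.0] * D
pos = [[rng.gauss(0, 0.02) for _ in range(D)] for _ in range(N + 1)]

tokens = embed(patches, E, cls_token, pos)
print(N, len(patches[0]), len(tokens), len(tokens[0]))  # 4 12 5 5
```

The resulting N + 1 tokens of dimension D are what would be fed to the Transformer encoder.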

Read 9 tweets

Mar 29, 2021

To get the intuition behind the Machine Learning algorithms, we need to have some background in Math, especially Linear Algebra, Probability & Calculus. Consolidating a few cheat-sheets here. A thread 👇

For Linear Algebra: Topics include vector spaces, matrix-vector operations, rank of a matrix, norms, eigenvectors and eigenvalues, and a bit of matrix calculus too.

souravsengupta.com/cds2016/lectur…

(Advanced) cs229.stanford.edu/section/cs229-…

For Probability & Statistics: Random variables, expectation, Probability distributions and so on.

stanford.edu/~shervine/teac…

stanford.edu/~shervine/teac…

(Advanced) cs229.stanford.edu/section/cs229-…

Read 7 tweets
