Latest Twitter Threads by @MLsummaries on Thread Reader App

Jun 7, 2021 • 6 tweets • 3 min read

There are way too many papers on #MachineLearning and on #DeepLearning these days. How to choose which papers to read? A tiny thread 🧵 The first one is our absolute favorite. Arxiv sanity by none other than @karpathy!

Link: arxiv-sanity.com

May 31, 2021 • 7 tweets • 4 min read

"Attention is all you need" is one of the most cited papers in last couple of years. What is attention? Let's try to understand in this thread.

Paper link: arxiv.org/abs/1706.03762

#DeepLearning #MachineLearning #Transformers

In Self-attention mechanism, we are updating the features of a given point, with respect to other features. The attention proposed in this paper is also known as Scaled dot-product attention.

Apr 7, 2021 • 8 tweets • 2 min read

This paper shares 56 stories of researchers in Computer Vision, young and old, scientists and engineers. Reading it was a cocktail of emotions as you simultaneously relate to the stories of joy,excitement,cynicism,and fear. Give it a read!

#ComputerVision

https://twitter.com/normsu/status/1377253149627580418?s=20

Some quotes from the stories - it was a "tough and hopeless time" in computer vision "before 2012, [when] the annual performance improvements over ImageNet are quite marginal."

Apr 7, 2021 • 9 tweets • 4 min read

Today we will summarize Vision Transformer (ViT) from Google. Inspired by BERT, they have implemented the same architecture for image classification tasks.

Link: arxiv.org/abs/2010.11929
Code: github.com/google-researc…

#MachineLearning #DeepLearning

The authors have taken the Bert architecture and applied it on an images with minimal changes.Since the compute increases with the length of the sequence, instead of taking each pixel as a word, they propose to split the image into some ’N’ patches and take each of them as token.

Apr 6, 2021 • 5 tweets • 3 min read

Depending on the problem we are trying to solve, the loss function varies. Today we are going to learn about Triplet losses. You must have heard about it while reading about Siamese networks.

#MachineLearning #DeepLearning #RepresentationLearning

Triplet loss is an important loss function for learning a good “representation”. What’s a representation you ask? Finding similarity (or difference) between two images is hard if you just use pixels.

Mar 29, 2021 • 7 tweets • 3 min read

To get the intuition behind the Machine Learning algorithms, we need to have some background in Math, especially Linear Algebra, Probability & Calculus. Consolidating a few cheat-sheets here. A thread 👇

For Linear Algebra: Topics include Vector spaces, Matrix vector operations, Rank of a matrix, Norms, Eigenvectors and values and a bit of Matrix calculus too.

souravsengupta.com/cds2016/lectur…

(Advanced) cs229.stanford.edu/section/cs229-…

Share this page!

Enter URL or ID to Unroll