Jean de Nyandwi
Vision 🤍 Language • Multimodal AI Research @ CMU • AI Education • Research blog: https://t.co/1BEFLZAqe7 • ML Pack: https://t.co/7PkTyDvuri
Mar 14, 2023 9 tweets 4 min read
GPT-4 is finally out. As many people alluded to lately, GPT-4 is multimodal: it can take images and text as input. It can answer questions about images, converse back and forth, etc. GPT-4 is essentially a better GPT-3.5 plus vision.

Blog: openai.com/research/gpt-4
Paper: cdn.openai.com/papers/gpt-4.p…

Some interesting things about GPT-4:

GPT-4 exhibits human-level performance on a range of language and vision academic benchmarks and shows excellent performance on professional exams.

Below is GPT-4's performance compared to GPT-3.5 and GPT-4 (no vision). Vision improves language!?
Dec 13, 2022 5 tweets 3 min read
Graph Machine Learning Resources

The world is interconnected and graphs are everywhere. Deep learning is increasingly being applied to process and analyze graphs and similar datasets.

Here are some of the best resources for people interested in graph machine learning ⬇️⬇️⬇️

Geometric Deep Learning - AMMI

This is one of the best graph ML courses, covering a wide range of topics. Its 2nd version was recently released. It is taught by M. Bronstein, P. Veličković, T. Cohen, and J. Bruna, all of whom are seminal figures in the field.

youtube.com/playlist?list=…
Aug 5, 2022 4 tweets 2 min read
Exploring Plain Vision Transformer Backbones for Object Detection

Introduces ViTDet, a simple, plain (non-hierarchical) Vision Transformer backbone for object detection. It achieves 61.3 box AP on COCO and completely decouples backbone design from downstream task design.

arxiv.org/abs/2203.16527

Most vision-transformer-based backbones, such as Swin Transformer, are hierarchical (channels increase and resolutions decrease with depth); they mimic ConvNets.

ViTDet takes a different approach: keep the isotropic structure of ViT and add a simple feature pyramid.
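As a rough illustration of the simple-feature-pyramid idea (not the authors' code; the channel counts, number of levels, and layers below are my own assumptions), multi-scale maps can be built from the single stride-16 ViT feature map with plain (de)convolutions:

```python
# Minimal sketch of a "simple feature pyramid" built from one stride-16 feature
# map, in the spirit of ViTDet. Channel counts and layers are illustrative only.
import torch
import torch.nn as nn

class SimpleFeaturePyramid(nn.Module):
    def __init__(self, dim=768, out_dim=256):
        super().__init__()
        self.to_p3 = nn.ConvTranspose2d(dim, dim, kernel_size=2, stride=2)  # stride 16 -> 8
        self.to_p5 = nn.MaxPool2d(kernel_size=2, stride=2)                  # stride 16 -> 32
        # 1x1 convs project every level to a common channel width
        self.proj = nn.ModuleDict({
            k: nn.Conv2d(dim, out_dim, kernel_size=1) for k in ("p3", "p4", "p5")
        })

    def forward(self, feat):            # feat: (B, dim, H/16, W/16) from a plain ViT
        levels = {"p3": self.to_p3(feat), "p4": feat, "p5": self.to_p5(feat)}
        return {k: self.proj[k](v) for k, v in levels.items()}

fm = torch.randn(1, 768, 40, 40)        # fake ViT output for a 640x640 image
pyramid = SimpleFeaturePyramid()(fm)
for name, t in pyramid.items():
    print(name, tuple(t.shape))
```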
Jul 3, 2022 5 tweets 3 min read
Efficient Transformers: A Survey

There have been many attempts to improve the efficiency of the Transformer, resulting in zillions of x-formers, but most people still use the vanilla Transformer.

Here is a comprehensive walkthrough of efficient Transformers

arxiv.org/abs/2009.06732

A standard Transformer works great, but its core component (self-attention) has quadratic time complexity. That is okay for short sequences (in tasks like NMT) but not for long sequences.

The main point of improving the Transformer is thus reducing that time complexity.
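To see where the quadratic cost comes from, here is a bare-bones scaled dot-product attention sketch (NumPy, single head, no masking); the n × n score matrix is what grows quadratically with sequence length:

```python
# Bare-bones scaled dot-product attention: the (n, n) score matrix is the
# quadratic bottleneck in both time and memory.
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (n, d) for a sequence of length n
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n)  <- O(n^2) time and memory
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # (n, d)

n, d = 4096, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)   # (4096, 64), but the intermediate score matrix was 4096 x 4096
```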
Jun 22, 2022 14 tweets 11 min read
Things are changing so fast. Long ago, to classify images, you typically had to extract features using traditional image processing techniques, and then feed them to a linear/SVM classifier.

The progress and ongoing unification of visual recognition architectures 🧵🧵

To detect objects, you had to generate thousands of candidate regions using region proposals and then have an SVM classifier and a regressor that classify each region and predict its box coordinates, respectively.
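A minimal sketch of that classical classification recipe (hand-crafted HOG features feeding a linear SVM); scikit-image, scikit-learn, and the small digits dataset are my own choices for illustration:

```python
# Classical pipeline: hand-crafted features (HOG) + a linear SVM classifier.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from skimage.feature import hog

digits = load_digits()                       # 8x8 grayscale images
features = np.array([
    hog(img, orientations=8, pixels_per_cell=(4, 4), cells_per_block=(1, 1))
    for img in digits.images
])
X_tr, X_te, y_tr, y_te = train_test_split(features, digits.target, random_state=0)

clf = LinearSVC().fit(X_tr, y_tr)            # no learned features, only a learned classifier
print("test accuracy:", clf.score(X_te, y_te))
```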
May 25, 2022 16 tweets 13 min read
MLOps, or Machine Learning Operations, is one of the most important things to be studying these days. Building models is one thing. Making models useful is another.

Why you need to study MLOps, and the only 4 learning resources you will ever need 🧵🧵

Today, we can all agree that model building is no longer the most challenging part of a machine learning project. Over the past decade, there has been massive progress in models.

For most problems, building your own models is neither required nor recommended.
May 2, 2022 8 tweets 4 min read
You might not believe it, but the following 6 machine learning books are fully free:

- Deep Learning
- Dive into Deep Learning
- Machine Learning Engineering
- Python Data Science Handbook
- Probabilistic Machine Learning
- Machine Learning Yearning

Here are their links 🧵

1. Deep Learning by @goodfellow_ian et al.

A nice book for machine learning and deep learning foundations

deeplearningbook.org
Apr 15, 2022 14 tweets 4 min read
The all-time 4 papers that revolutionized deep learning, computer vision, and NLP in the last decade 🧵🧵

The 2010s inarguably hosted some of the most important breakthroughs in deep learning algorithms and applications.
Mar 22, 2022 16 tweets 3 min read
The single biggest difficulty in designing deep learning networks, and how to tackle it ⬇️⬇️⬇️

One of the biggest challenges in designing neural network architectures is determining the right size of the network.

Make the network too small and it fails to capture the representations in the data. It underfits.
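The usual remedy is to treat size as a hyperparameter and sweep it. A minimal sketch (PyTorch, with synthetic data as a stand-in): the smallest width is likely to underfit, larger widths have more capacity:

```python
# Sweep the network size and watch held-out accuracy; too small -> underfitting.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1024, 20)
y = (X[:, :5].sum(dim=1) > 0).long()              # toy binary target
X_tr, y_tr, X_va, y_va = X[:768], y[:768], X[768:], y[768:]

def make_mlp(width):
    return nn.Sequential(nn.Linear(20, width), nn.ReLU(), nn.Linear(width, 2))

for width in [2, 16, 128]:                        # small -> likely underfits
    model = make_mlp(width)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X_tr), y_tr)
        loss.backward()
        opt.step()
    acc = (model(X_va).argmax(dim=1) == y_va).float().mean()
    print(f"width={width:4d}  val_acc={acc.item():.2f}")
```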
Mar 21, 2022 6 tweets 1 min read
Machine learning is data-driven programming. Ordinary programming is driven by hand-coded rules.

The former heavily relies on data. Hence, better and bigger data leads to better machine learning models.

In essence, machine learning uses the past behavior or history of a particular event to predict its future.

For example: given a user's record, can you predict whether they will keep using or leave our service/platform, so we can improve their experience? This is churn prediction.
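A minimal sketch of that churn setup (the features and data here are entirely made up for illustration), using scikit-learn:

```python
# Toy churn prediction: past behavior (features) -> will the user leave? (label)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# hypothetical per-user history: tenure (months), logins last month, support tickets
X = np.column_stack([
    rng.integers(1, 60, n),
    rng.poisson(12, n),
    rng.poisson(1, n),
])
# synthetic rule: short tenure, few logins, many tickets -> more likely to churn
score = -0.05 * X[:, 0] - 0.1 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(0, 1, n)
y = (score > score.mean()).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
print("churn probability for a new user:", model.predict_proba([[3, 2, 4]])[0, 1])
```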
Mar 8, 2022 22 tweets 7 min read
Some of the biggest inventions of the past decade (2010s) in the deep learning community:

- Deep convolutional neural networks(led by AlexNet)
- Batch normalization
- Residual networks
- Transformers

Today, deep learning revolves around those inventions. Let's talk about them 🧵

AlexNet is one of the first deep convolutional neural networks that showed excellent performance on image recognition tasks.

AlexNet had 8 learnable layers (5 convolutional + 3 fully connected) and was among the deepest networks of its time (2012). Ever since, CNNs have become deeper and deeper: 100, 150, 500, 1000+ layers.
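Two of the inventions listed above, batch normalization and residual connections, are easy to sketch together. Here is a minimal residual block in PyTorch (channel counts and layer layout are illustrative, not taken from any particular paper):

```python
# Minimal residual block: conv -> batch norm -> ReLU -> conv -> batch norm,
# with the input added back before the final ReLU (the skip connection).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)        # identity shortcut eases training of deep stacks

x = torch.randn(2, 64, 56, 56)
print(ResidualBlock(64)(x).shape)         # torch.Size([2, 64, 56, 56])
```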
Feb 10, 2022 13 tweets 4 min read
Most deep learning models sell accuracy, parameter counts, and floating-point operations, but one of the most important things for real-world applications is latency, i.e., the speed of the model.

A thread on why and where latency matters 🧵🧵

Since the AlexNet moment (the first deep network to demonstrate state-of-the-art results on visual recognition tasks), deep networks have gotten bigger and bigger, because bigger networks work better. This is inarguable.

proceedings.neurips.cc/paper/2012/fil…
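As an aside on measurement, here is a minimal sketch of how one might time inference latency (PyTorch and torchvision are my assumption; the warm-up runs and averaging are the parts that matter, the model is just a stand-in):

```python
# Measure single-input inference latency: warm up first, then average many runs.
import time
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    for _ in range(10):                 # warm-up (allocator, caches, lazy init)
        model(x)
    runs = 50
    t0 = time.perf_counter()
    for _ in range(runs):
        model(x)
    latency_ms = (time.perf_counter() - t0) / runs * 1000
print(f"mean latency: {latency_ms:.1f} ms per image")
```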
Jan 11, 2022 19 tweets 5 min read
In the early days of learning any particular skill, structured courses are extremely important for providing the foundations.

For machine learning, the most foundation-rich courses are:

- Machine Learning Stanford
- Deep Learning Specialization

Very likely, these courses will be tough if you are just getting started.

It's okay if you don't understand everything at the moment. You just have to resist the temptation to quit because you don't understand everything. Even what you do understand now, you will forget anyway.
Jan 8, 2022 11 tweets 4 min read
Early last year, I wanted to learn about Machine Learning Operations(MLOps).

MLOps refers to the whole set of processes involved in building and deploying machine learning models reliably.

A thread on the importance of MLOps and the resources that I used 🧵

As you may have heard, models are a tiny part of any typical ML-powered application.

Nothing stresses that better than this picture:

Source: Hidden Technical Debt in Machine Learning Systems, proceedings.neurips.cc/paper/2015/fil…
Jan 4, 2022 4 tweets 2 min read
Today, you can get started with data science with the Python Data Science Handbook.

It is a beginner-friendly yet intensive book that covers the fundamentals of data science and tools such as NumPy, Matplotlib, Seaborn, and Scikit-Learn.

Free on the web.

jakevdp.github.io/PythonDataScie…

Thanks to the author @jakevdp for making the book free to read on the web.
Dec 20, 2021 18 tweets 6 min read
ML Weekly Highlights ⚡️

From me:
- Types of ML systems
- What remains misunderstood about batch normalization and internal covariate shift

From the community:
- Attention is all you need implementations
- Transformers learning resources
- OpenAI WebGPT
- DeepMind Arnheim

🧵

This past week, I wrote about the different types of machine learning systems, which are:

- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Self-supervised learning
- Reinforcement learning

Dec 15, 2021 31 tweets 5 min read
The following are the 5 main types of machine learning systems, based on the level of supervision involved in the learning process:

◆Supervised learning
◆Unsupervised learning
◆Semi-supervised learning
◆Self-supervised learning
◆Reinforcement learning

Let's talk about them... 🧵

1. Supervised learning

This is the most common type of machine learning. Most ML problems that we encounter fall into this category.

As the name implies, a supervised learning algorithm is trained with input data along with some form of guidance that we can call labels.
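A minimal supervised-learning sketch in scikit-learn, with the iris dataset standing in for "input data plus labels":

```python
# Supervised learning in a nutshell: learn a mapping from inputs X to labels y.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)                 # features + human-provided labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)   # learn from (X, y) pairs
print("accuracy on unseen data:", clf.score(X_te, y_te))
```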
Dec 12, 2021 25 tweets 6 min read
ML Weekly Highlights ⚡️

From me:
◆How to learn a hard subject
◆AlexNet ConvNet architecture

From the community:
◆Building models like building open source software
◆Software 2.0
◆DeepMind's new 280-billion-parameter language model

This week, I wrote about how to learn a difficult subject and shared an implementation of the AlexNet architecture.

The motivation to post about learning a hard subject came while I was preparing for an exam in one of my most difficult courses.

Dec 7, 2021 4 tweets 1 min read
How to learn a hard subject:

◆Find all relevant materials/resources (the fewer the better)
◆Get a high-level understanding of the subject
◆Go deep into the materials, but slowly. Divide the topics into chunks/slices and only eat one slice at a time.
◆Pause in between sessions
◆Try to recall what you learned! Note down what you remember, and review what you don't remember. This is where actual learning happens.
Dec 5, 2021 9 tweets 4 min read
ML Weekly Highlights 💡

From me:
◆How to think about precision and recall
◆The ins and outs of Convolutional Neural Networks

From the community:
◆Arxiv sanity lite
◆Dive Into Deep Learning Book on Sagemaker
◆60 years of research
◆GauGAN AI Art Demo: Paint me a picture

🧵

This past week, I wrote an in-depth thread about convolutional neural networks.

Here is the thread!

Dec 3, 2021 7 tweets 2 min read
Hi,

The number of layers depends on the size of the dataset, but there is no way to know the right number in advance, although somewhere between 1 and 10 is a reasonable first trial.

Nothing in deep networks is clearly predefined. It's all experimenting, experimenting, and experimenting.

ReLU activation is always a good starting point. You can later try other non-saturating activations like SELU, ELU, etc., but avoid sigmoid and tanh in the hidden layers.
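Put together, a typical starting point might look like this sketch (PyTorch; the layer count and widths are just the kind of initial guess described above):

```python
# A plain starting-point MLP: a few hidden layers, ReLU activations throughout,
# and no sigmoid/tanh in the hidden layers.
import torch.nn as nn

def make_mlp(in_dim, out_dim, hidden=(128, 64), activation=nn.ReLU):
    layers, prev = [], in_dim
    for width in hidden:                       # start small; widen/deepen only if it underfits
        layers += [nn.Linear(prev, width), activation()]
        prev = width
    layers.append(nn.Linear(prev, out_dim))    # raw logits; pair with an appropriate loss
    return nn.Sequential(*layers)

model = make_mlp(in_dim=20, out_dim=2)         # later: try activation=nn.SELU or nn.ELU
print(model)
```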