Jean de Nyandwi
Oct 18, 2021 · 12 tweets · 4 min read
Kaggle's 2021 State of Data Science and Machine Learning survey was released a few days ago.

If you didn't see it, here are some important takeaways 🧵
Top 5 IDEs

1. Jupyter Notebook
2. Visual Studio Code
3. JupyterLab
4. PyCharm
5. RStudio
ML Algorithms Usage: Top 10

1. Linear/logistic regression
2. Decision trees/random forests
3. Gradient boosting machines (XGBoost, LightGBM)
4. Convnets
5. Bayesian approaches
6. Dense neural networks (MLPs)
7. Recurrent neural networks (RNNs)
8. Transformers (BERT, GPT-3)
9. GANs
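Linear/logistic regression topping the list is a reminder of how far a simple baseline goes. As a rough illustration (not from the survey), here is a minimal logistic regression fit with plain stochastic gradient descent on invented toy data; all names are made up for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights w and bias b by SGD on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0

# Toy, linearly separable data: label 1 when the two features are both large
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.9, 0.9], [0.1, 0.2]]
y = [0, 0, 0, 1, 1, 0]
w, b = fit_logistic(X, y)
```

In practice you would reach for `sklearn.linear_model.LogisticRegression` rather than hand-rolling this, which is part of why Scikit-Learn also tops the tools list below.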
Machine Learning Tools Landscape - Top 8

1. Scikit-Learn
2. TensorFlow (tf.keras included)
3. XGBoost
4. Keras
5. PyTorch
6. LightGBM
7. CatBoost
8. Hugging Face 🤗
Cloud Computing Tools - Top 3

1. AWS
2. GCP
3. Microsoft Azure
Enterprise ML Tools - Top 5

1. Amazon SageMaker
2. Databricks
3. Azure ML Studio
4. Google Cloud Vertex AI
5. DataRobot

Notes: Judging by the graph, over half of the survey respondents don't use these kinds of tools at all.
Databases - Top 4

1. MySQL
2. PostgreSQL
3. Microsoft SQL Server
4. MongoDB
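Three of the four are classic relational engines (MongoDB being the document-store exception). For a flavor of the SQL they share, here is a minimal sketch using Python's built-in sqlite3; the table and column names are invented for illustration, and roughly the same query runs on MySQL or PostgreSQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE submissions (user TEXT, score REAL)")
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?)",
    [("alice", 0.91), ("bob", 0.87), ("alice", 0.95)],
)

# Best score per user, ordered like a leaderboard
rows = conn.execute(
    "SELECT user, MAX(score) FROM submissions GROUP BY user ORDER BY MAX(score) DESC"
).fetchall()
print(rows)  # [('alice', 0.95), ('bob', 0.87)]
```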
Machine Learning Experimentation Tools - Top 4

1. TensorBoard
2. MLflow
3. Weights & Biases
4. Neptune.ai

Notes: Judging by the graph, the majority of Kagglers do not track their ML experiments. All eyes on the leaderboard!
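At their core, tools like MLflow and Weights & Biases automate exactly this: recording the parameters and metrics of each run so you can compare them later. A stdlib-only sketch of the idea (all file and field names invented, not any tool's actual format):

```python
import json
import time
import uuid

def log_run(path, params, metrics):
    """Append one experiment run as a JSON line."""
    record = {
        "run_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

def best_run(path, metric):
    """Return the logged run with the highest value of `metric`."""
    with open(path) as f:
        runs = [json.loads(line) for line in f]
    return max(runs, key=lambda r: r["metrics"][metric])

log_run("runs.jsonl", {"lr": 0.1, "depth": 6}, {"auc": 0.91})
log_run("runs.jsonl", {"lr": 0.3, "depth": 8}, {"auc": 0.88})
print(best_run("runs.jsonl", "auc")["params"])  # {'lr': 0.1, 'depth': 6}
```

The real tools add UIs, artifact storage, and collaboration on top, but the habit they encode is the same: never lose track of which settings produced which score.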
AutoML Tools - Top 5

1. Google Cloud AutoML
2. Azure Automated ML
3. Amazon SageMaker Autopilot
4. H2O Driverless AI
5. Databricks AutoML
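What these AutoML products automate is, at heart, a search over model and hyperparameter combinations. A minimal grid-search sketch in pure Python; the `evaluate` function here is an invented stand-in for real cross-validated model training:

```python
from itertools import product

def grid_search(grid, evaluate):
    """Try every combination in `grid`; return the best config and its score."""
    keys = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Stand-in for cross-validated model quality; a real tool fits a model here.
def evaluate(cfg):
    return 1.0 - abs(cfg["lr"] - 0.1) - 0.01 * abs(cfg["depth"] - 6)

grid = {"lr": [0.01, 0.1, 0.3], "depth": [4, 6, 8]}
cfg, score = grid_search(grid, evaluate)
print(cfg)  # {'lr': 0.1, 'depth': 6}
```

Real AutoML systems replace the exhaustive loop with smarter search (Bayesian optimization, early stopping) and also search over model families, but the interface is the same: hand over a search space, get back the best configuration found.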
CONCLUSIONS:

1. Notebooks are still the most popular way of experimenting with ML. If you have never used them, try them in VS Code.
2. Scikit-Learn is ahead of the game.
3. All you need is XGBoost (CC: @tunguz).
4. No need for model tracking on Kaggle. There is a leaderboard.
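For intuition on why "all you need is XGBoost" keeps ringing true on tabular data, here is a toy version of the idea behind gradient boosting: each new weak learner (a one-split stump here) is fit to the residuals of the current ensemble. This is a pure-Python sketch with invented names and squared loss, not XGBoost's actual algorithm:

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (squared loss)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=20, lr=0.5):
    """Gradient boosting with stumps: each round fits the remaining residuals."""
    stumps = []
    pred = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

model = boost([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 3.0])
```

XGBoost and LightGBM add regularization, second-order gradients, and clever histogram-based split finding, but the residual-fitting loop above is the core recipe.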
If you would like to read the whole survey, here is the link:

kaggle.com/kaggle-survey-…
Thanks for reading.

For more machine learning content, follow @Jeande_d.


More from @Jeande_d

Mar 14, 2023
GPT-4 is finally out. As many people alluded to lately, GPT-4 is multimodal: it can take images and text, answer questions about images, converse back and forth, etc. GPT-4 is essentially a better GPT-3.5 + vision.

Blog: openai.com/research/gpt-4
Paper: cdn.openai.com/papers/gpt-4.p…
Some interesting things about GPT-4

GPT-4 exhibits human-level performance on a range of language and vision academic benchmarks and shows excellent performance in professional exams.

Below is GPT-4's performance compared to GPT-3.5 and GPT-4 (no vision) >> vision improves language!?
GPT-4 demonstrates remarkable few-shot (3-shot) accuracy on MMLU (Multi-task Language Understanding). In particular, GPT-4's results on low-resource languages are very impressive (some better than GPT-3.5 on English).
Dec 13, 2022
Graph Machine Learning Resources

The world is interconnected and graphs are everywhere. Deep learning is increasingly being applied to graphs and graph-like datasets.

Here are some of the best resources for people interested in graph machine learning ⬇️⬇️⬇️
Geometric Deep Learning - AMMI

This is one of the best graph ML courses, covering a wide range of topics. Its 2nd version was recently released. Taught by M. Bronstein, P. Veličković, T. Cohen, and J. Bruna, all of whom are seminal figures in the field.

youtube.com/playlist?list=…
Machine Learning with Graphs - Stanford CS224W

CS224W is also a great graph ML course that covers a number of topics in the field in bite-sized lectures (60 lectures in total).

youtube.com/playlist?list=…
Aug 5, 2022
Exploring Plain Vision Transformer Backbones for Object Detection

Introduces ViTDet, a simple plain/non-hierarchical Vision Transformer backbone for object detection. Achieves 61.3 box AP on COCO. Completely decouples backbone design & downstream tasks.

arxiv.org/abs/2203.16527
Most vision transformer-based backbones, such as Swin Transformer, are hierarchical (channels increase and resolutions decrease with depth); they mimic ConvNets.

ViTDet takes a different approach: keep the isotropic structure of ViT and add a simple feature pyramid.
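The "simple feature pyramid" just rescales the single-scale ViT output map into multiple resolutions (the paper uses strided convolutions and deconvolutions). A shape-level sketch of the idea using plain average pooling, pure Python and illustrative only, not the paper's actual layers:

```python
def avg_pool(fmap, stride):
    """Downsample a 2-D map by averaging stride x stride blocks."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(stride) for dj in range(stride))
            / (stride * stride)
            for j in range(0, w - stride + 1, stride)
        ]
        for i in range(0, h - stride + 1, stride)
    ]

def simple_pyramid(fmap):
    """Multi-scale maps from one single-scale map (ViTDet-style sketch)."""
    return {"1x": fmap, "1/2x": avg_pool(fmap, 2), "1/4x": avg_pool(fmap, 4)}

fmap = [[float(i + j) for j in range(8)] for i in range(8)]
pyr = simple_pyramid(fmap)
print([len(pyr[k]) for k in ("1x", "1/2x", "1/4x")])  # [8, 4, 2]
```

The detector head then consumes these maps exactly as it would a ConvNet's hierarchical features, which is what lets the plain backbone stay decoupled from the downstream task.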
The 2 main highlights of ViTDet:

- Uses window self-attention but stays away from shifted windows
- Separates pretraining & finetuning, which makes it possible to use self-supervised pretraining. The best results are indeed achieved with MAE (masked autoencoder) pretraining.
Jul 3, 2022
Efficient Transformers: A Survey

There have been many attempts to improve the efficiency of the Transformer, which resulted in zillions of xformers, but most people still use the vanilla Transformer.

Here is a comprehensive walkthrough of efficient Transformers

arxiv.org/abs/2009.06732
A standard Transformer works great, but its core component (self-attention) has quadratic time complexity, something that is okay for short sequences (in tasks like NMT) but not good for long sequences.

The main point of improving the Transformer is thus reducing that time complexity.
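The quadratic cost is easy to see by counting score computations: every query attends to every key, and each query-key pair costs one dot product of dimension d. A stdlib-only sketch (toy numbers, function name invented):

```python
def attention_score_ops(n, d):
    """Multiply-adds to build the QK^T score matrix for a length-n sequence."""
    return n * n * d  # n queries x n keys, each dot product costs d

# Doubling the sequence length quadruples the score cost:
print(attention_score_ops(512, 64))   # 16777216
print(attention_score_ops(1024, 64))  # 67108864
```

Linear-attention variants attack exactly this term, typically by approximating or sparsifying the n x n score matrix so the cost grows as n rather than n squared.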
Despite the many xformers that claim linear (or near-linear) time complexity, "it's still a mystery to know which fundamental efficient Transformer block one should consider using."

Indeed, most language & vision applications are still dominated by the standard Transformer.
Jun 22, 2022
Things are changing so fast. Not long ago, to classify images, you typically had to extract features using traditional image processing techniques and then feed them to a linear/SVM classifier.

The progress and ongoing unification of visual recognition architectures 🧵🧵
To detect objects, you had to generate thousands of candidate object regions using region proposals and then have an SVM classifier & regressor that classify & predict box coordinates respectively.
After the 2012 AlexNet moment, things started to change. All you needed for image classification was a ConvNet. No more traditional feature engineering.
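What changed was that the filters became learned rather than hand-crafted, but the core operation stayed the same 2-D convolution. A minimal valid-mode convolution in pure Python, run with a classic hand-crafted edge filter on an invented toy image:

```python
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation (the 'convolution' ConvNets use)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            ))
        out.append(row)
    return out

# A Sobel vertical-edge filter on a tiny image: left half dark, right half bright
image = [[0, 0, 9, 9]] * 4
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
edges = conv2d(image, sobel_x)
print(edges)  # [[36, 36], [36, 36]] -- strong response at the edge
```

In a ConvNet, `sobel_x` would be a weight tensor updated by backpropagation instead of a filter someone designed, which is exactly the feature-engineering step that AlexNet made obsolete.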
May 25, 2022
MLOps, or Machine Learning Operations, is one of the most important things to be studying these days. Building models is one thing. Making models useful is another.

Why you need to study MLOps, and the only 4 learning resources you will ever need 🧵🧵
Today, we can all agree that model building is no longer the most challenging part of a machine learning project. Over the past decades, there has been massive development in models.

In most problems, it's neither required nor recommended to build your own models.
If models are no longer the bottleneck, what's important then and what does this have to do with MLOps that we started this thread talking about?

Without bells and whistles, let's see why MLOps is important and the resources that you can use to learn it.
