Answer these questions:
🔹 What's your team's ML expertise?
🔹 How much control/abstraction do you need?
🔹 Would you like to handle the infrastructure components?
🧵 👇
@SRobTweets created this pyramid to explain the idea.
As you move up the pyramid, less ML expertise is required, and you also don't need to worry as much about the infrastructure behind your model.
If you're using open-source ML frameworks (#TensorFlow) to build the models, you get the flexibility of moving your workloads across different development & deployment environments. But you need to manage all the infrastructure yourself for training & serving.
Deep Learning VMs provide managed, click-to-deploy VMs for processing data & training the model
🔹 Popular ML frameworks pre-installed
🔹 Reduces the overhead of managing & allocating the compute & storage required
🔹 But you figure out how you'll serve those models
Kubeflow: open-source project for deploying ML workloads on #Kubernetes
🔹 Helps configure a multi-step ML pipeline, including pre-processing data, training & serving the model
🔹 Run it on-premises or on any cloud
🔹 You'll still need to configure where it's managed
AI Platform: managed service for all custom model needs
🔹 Includes tools for training & serving models, hosted notebooks, a data labeling service & more
🔹 E.g., take notebook code running on-premises with Kubeflow, and run it on GCP with AI Platform Notebooks
BQML: Brings the power of ML closer to where the data is analyzed & makes it accessible to data analysts
🔹 You don't have to write any of the underlying model code
🔹 Choose a model type
🔹 Use simple SQL queries to create & train the model & make predictions
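To make the "simple SQL queries" point concrete, here is a minimal sketch that only builds the two statements as strings. `CREATE MODEL` and `ML.PREDICT` are real BQML constructs, but the dataset, model, table, and label names below are hypothetical, and the model type is just one possible choice.

```python
def bqml_statements(dataset: str, model: str, table: str, label: str):
    """Return (train_sql, predict_sql) for a hypothetical BigQuery table."""
    # Training: the model type is chosen in OPTIONS; no model code is written.
    train_sql = f"""
CREATE OR REPLACE MODEL `{dataset}.{model}`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['{label}']) AS
SELECT * FROM `{dataset}.{table}`
""".strip()
    # Prediction: ML.PREDICT takes the trained model and a query for the input rows.
    predict_sql = f"""
SELECT * FROM ML.PREDICT(
  MODEL `{dataset}.{model}`,
  (SELECT * EXCEPT ({label}) FROM `{dataset}.{table}`)
)
""".strip()
    return train_sql, predict_sql


# All four names are placeholders invented for this example.
train_sql, predict_sql = bqml_statements("my_dataset", "churn_model", "customers", "churned")
print(train_sql)
print(predict_sql)
```

In practice you would paste these statements into the BigQuery console or send them through a client library. Predicting on the training table is done here only to keep the sketch short.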
AutoML democratizes ML: build custom ML models regardless of your ML expertise.
🔹 Use the UI to upload the data: images, video, text, or structured
🔹 Press the "train" button
🔹 Model is available for prediction via an API
🔹 No need to deploy it yourself
ML APIs: Easiest and fastest way to get started with AI
🔹 You don't need ML engineers or data scientists, just some developers
🔹 Simple API request to pre-trained models for images, video, speech, text & translation
🔹 No need to supply any training data yourself
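As a rough idea of what a "simple API request" looks like, this sketch builds (but does not send) a label-detection request for the Cloud Vision API. The endpoint and request shape follow the public REST API; the bucket path and API key are placeholders, and your project may authenticate differently.

```python
import json
import urllib.request

VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_uri: str, api_key: str, max_results: int = 5):
    """Build a POST request asking a pre-trained model to label one image."""
    body = {
        "requests": [{
            "image": {"source": {"imageUri": image_uri}},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }
    return urllib.request.Request(
        url=f"{VISION_ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder image path and key: replace both before sending anything.
request = build_label_request("gs://my-bucket/cat.jpg", api_key="YOUR_API_KEY")
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))
```

Sending it would be a single `urllib.request.urlopen(request)` call. No training data or model code is involved at any point.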
How to deal with imbalanced datasets?
Most real-world datasets are not perfectly balanced. If 90% of your dataset belongs to one class and only 10% to the other, how can you prevent your model from predicting the majority class 90% of the time?
🧵 👇
🐱🐱🐱🐱🐱🐱🐱🐱🐱🐶 (90:10)
💳 💳 💳 💳 💳 💳 💳 💳 💳 ⚠️ (90:10)
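A quick bit of arithmetic shows why this matters. In this sketch, a "model" that always predicts the majority class on a made-up 90:10 dataset looks accurate but never finds a single minority example.

```python
# Made-up 90:10 dataset: 90 cats, 10 dogs.
labels = ["cat"] * 90 + ["dog"] * 10

# A lazy "model" that always predicts the majority class.
predictions = ["cat"] * len(labels)

# Accuracy: share of all examples predicted correctly.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall for the minority class: share of real dogs that were found.
dogs_found = sum(p == "dog" and y == "dog" for p, y in zip(predictions, labels))
dog_recall = dogs_found / labels.count("dog")

print(f"accuracy = {accuracy:.2f}, dog recall = {dog_recall:.2f}")
```

Accuracy alone hides the problem, so it helps to check a per-class metric such as recall on imbalanced data.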
There can be many reasons for imbalanced data. The first step is to see if it's possible to collect more data. If you're already working with all the data that's available, these techniques can help 👇
Here are 3 techniques for addressing data imbalance. You can use just one of these or all of them together:
🔹 Downsampling
🔹 Upsampling
🔹 Weighted classes
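Here is a minimal, standard-library sketch of the three techniques on a made-up 90:10 dataset. The function names are illustrative. Upsampling is done here by plain duplication (other approaches generate synthetic examples instead), and the weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def downsample(majority, minority, rng):
    """Randomly drop majority examples until both classes are the same size."""
    return rng.sample(majority, len(minority)) + minority


def upsample(majority, minority, rng):
    """Randomly duplicate minority examples (sampling with replacement)."""
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority + minority + extra


def class_weights(labels):
    """Give rarer classes a larger weight: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up data: (example_id, label) pairs in a 90:10 ratio.
majority = [(i, "legit") for i in range(90)]
minority = [(i, "fraud") for i in range(90, 100)]

down = downsample(majority, minority, rng)
up = upsample(majority, minority, rng)
weights = class_weights([label for _, label in majority + minority])

print("downsampled:", Counter(label for _, label in down))
print("upsampled:  ", Counter(label for _, label in up))
print("weights:    ", weights)
```

Downsampling throws data away and upsampling repeats it, so class weights are often the least invasive option when your training library accepts them.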
ML = Using data to answer questions!
🔹 Using data = Training
🔹 Answer questions = Predictions
Let's keep going... 🧵👇
2/4 What are the 7 steps in Machine Learning?
1️⃣ Collect Data
2️⃣ Prepare Data
3️⃣ Choose a Model
4️⃣ Train the Model
5️⃣ Evaluate the Model
6️⃣ Parameter Tuning
7️⃣ Make Predictions
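The seven steps can be sketched end to end in a few lines of plain Python. Everything here is an assumption chosen for illustration: the synthetic data, the straight-line model, gradient descent as the training method, and the learning rate as the only parameter being tuned.

```python
import random

rng = random.Random(42)

# 1. Collect data: synthetic points near y = 2x + 1 (invented for this demo).
data = [(i / 50, 2 * (i / 50) + 1 + rng.uniform(-0.1, 0.1)) for i in range(50)]

# 2. Prepare data: shuffle, then split into train / validation / test.
rng.shuffle(data)
train, val, test = data[:30], data[30:40], data[40:]


# 3. Choose a model: a straight line, y = w * x + b.
def predict(params, x):
    w, b = params
    return w * x + b


def mse(params, rows):
    """Mean squared error of the line on a list of (x, y) rows."""
    return sum((predict(params, x) - y) ** 2 for x, y in rows) / len(rows)


# 4. Train the model: plain gradient descent on the mean squared error.
def fit(rows, learning_rate, epochs=500):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / len(rows)
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / len(rows)
        w, b = w - learning_rate * grad_w, b - learning_rate * grad_b
    return w, b


# 5 + 6. Evaluate on the validation split and tune the learning rate.
candidates = [0.01, 0.1, 0.5]
best_params = min((fit(train, lr) for lr in candidates), key=lambda p: mse(p, val))

# Final check on data the model never saw during training or tuning.
test_error = mse(best_params, test)

# 7. Make predictions on a new input.
print("params:", best_params, "test MSE:", test_error)
print("prediction for x = 0.5:", predict(best_params, 0.5))
```

Keeping a separate test split means the final error estimate is not flattered by the tuning step.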
🔹 The quantity & quality of your data dictate how accurate your model is
🔹 The outcome of this step is usually a table with some values (features)
🔹 If you want to use pre-collected data, get it from sources such as Kaggle or BigQuery Public Datasets