The 4 stages of a machine learning project lifecycle:
1. Project scoping
2. Data definition and preparation
3. Model training and error analysis
4. Deployment, monitoring, and maintenance
Here are 29 questions that you can use at each step of the process.
↓
Project scoping
• What problem are we trying to solve?
• Why do we need to solve this problem?
• What are the constraints?
• What are the risks?
• What's the best approach to solving it?
• How do we measure progress?
• What does success look like?
Data definition and preparation
• What data do we need?
• How are we going to get it?
• How frequently does it change?
• Do we trust the source?
• How is this data biased?
• Can we improve it somehow?
• How are we going to clean it?
• How are we going to augment it?
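One quick way to start on "How is this data biased?" is to look at how the labels are distributed. This is only a minimal sketch: it assumes the labels fit in a plain Python list, and the function name `label_distribution` and the sample labels are hypothetical, made up for illustration. Label imbalance is just one kind of bias among many.

```python
from collections import Counter


def label_distribution(labels):
    """Return the share of each label, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.most_common()}


# Hypothetical labels, for illustration only.
labels = ["spam", "ham", "ham", "ham", "ham", "spam", "ham", "ham", "ham", "ham"]

shares = label_distribution(labels)
print(shares)
```

A heavily skewed distribution is a hint to collect more data for the rare class, or to pick an evaluation metric that does not reward always predicting the common one.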
Model training and error analysis
• What's a good baseline?
• What's a good starting point?
• Has anyone solved this before?
• How are we going to test the model?
• Are the results good enough?
• Are we solving the problem?
• How can we improve the results?
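For "What's a good baseline?", one common starting point on a classification problem is to always predict the most frequent class and see what score that gets. The sketch below assumes a classification task with labels in plain lists; the helper names `majority_class_baseline` and `accuracy`, and the toy data, are hypothetical.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Build a predictor that always returns the most frequent training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]

    def predict(_features):
        # The features are ignored on purpose: this is the "no learning" baseline.
        return most_common_label

    return predict


def accuracy(predict, features, labels):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(predict(x) == y for x, y in zip(features, labels))
    return correct / len(labels)


# Hypothetical toy data, for illustration only.
train_labels = ["ham", "ham", "spam", "ham"]
test_features = [None, None, None, None, None]
test_labels = ["ham", "spam", "ham", "ham", "spam"]

baseline = majority_class_baseline(train_labels)
baseline_accuracy = accuracy(baseline, test_features, test_labels)
print(baseline_accuracy)
```

Any real model should beat this number. If it doesn't, "Are the results good enough?" already has an answer.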
Deployment, monitoring, and maintenance
• Where do we host?
• How much do we need to scale?
• What metrics should we monitor?
• What results do we expect?
• How is the model doing compared to that?
• How do we keep the model up to date?
• What's our rollback strategy?
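The questions about expected results, live performance, and rollback can be tied together with a simple check. This is a sketch under stated assumptions, not a recommended policy: it assumes a single metric where higher is better, and the function name `should_roll_back`, the 0.05 tolerance, and the accuracy values are all hypothetical.

```python
def should_roll_back(expected, observed, tolerance=0.05):
    """Return True when the observed metric falls more than `tolerance` below expected.

    Assumes a metric where higher is better (for example, accuracy).
    """
    return (expected - observed) > tolerance


# Hypothetical numbers, for illustration only.
expected_accuracy = 0.90

small_drop = should_roll_back(expected_accuracy, 0.88)
large_drop = should_roll_back(expected_accuracy, 0.80)

print(small_drop)  # drop of about 0.02, inside the tolerance
print(large_drop)  # drop of about 0.10, outside the tolerance
```

In practice the tolerance would come from the answer to "What results do we expect?", and the check would run on a window of recent predictions rather than a single number.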
Every question opens a new set of possibilities, discoveries, and improvements.
The more you ask, the better your system will be.
Follow me @svpino, and I'll help you stay curious, one thread at a time, right on your Twitter timeline.
I post every single day.
If you build machine learning systems professionally, what questions would you recommend that others start asking?
What questions lead to interesting discoveries with the potential to change the outcome of the project?
• Improve as a developer
• Improve your communication
• Take a course. Take another. Repeat.
• Solve problems. Many of them.
• Teach others.
• Analysis first. Code is secondary.
• Stay curious.
“Tutorial hell” only happens when you focus on consumption and neglect production.