Most Data Strategies are missing a critical component: a Data Monetization Catalog. It's not difficult to build. Here's my process: 1/8 #DataScience #MachineLearning #Data #Strategy
The process starts with one question: what use cases is this data used for? Use cases have business value, so there's a straight-line connection from data to value. 2/8 #DataScience #MachineLearning #Data #Strategy
I walk clients through this exercise, and it reveals excellent insights because data catalogs and dictionaries are connected to technical use cases but rarely to business use cases.
1. Data teams spend a lot of their time working on low-value datasets.
2. Most datasets are underutilized, meaning they could improve analytics and model quality for existing use cases but aren't being used for them. 4/8 #DataScience #MachineLearning #Data #Strategy
3. The business has a lot of data that is no longer being used or should no longer be used. Those datasets have reached end of life and should be archived. 5/8 #DataScience #MachineLearning #Data #Strategy
4. Many data use cases are underserved by their current datasets and need new data sources.
The Data Monetization Catalog connects data's technical value and business value. It's an alignment tool that D&A orgs can use to keep their work and prioritization frameworks connected with the rest of the business. 7/8 #DataScience #MachineLearning #Data #Strategy
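Here's a minimal sketch of what a catalog like this could look like in code. Everything in it is an assumption made for illustration, not the process from the subscriber post: the `UseCase` and `Dataset` fields, the dollar values, the `low_value_per_hour` threshold, and the rule for flagging each of the four findings above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UseCase:
    name: str
    annual_value: float           # hypothetical business value estimate, in dollars
    needs_new_data: bool = False  # True if current datasets don't serve it well


@dataclass
class Dataset:
    name: str
    team_hours_per_month: float                            # effort spent maintaining it
    used_by: List[str] = field(default_factory=list)       # use cases it serves today
    could_serve: List[str] = field(default_factory=list)   # use cases it could improve


def build_report(datasets: List[Dataset],
                 use_cases: Dict[str, UseCase],
                 low_value_per_hour: float = 1000.0) -> Dict[str, List[str]]:
    """Flag the four findings; the threshold is an illustrative assumption."""
    report: Dict[str, List[str]] = {
        "low_value": [], "underutilized": [], "end_of_life": [], "underserved": []
    }
    for ds in datasets:
        # No current or potential use case: candidate for archiving.
        if not ds.used_by and not ds.could_serve:
            report["end_of_life"].append(ds.name)
            continue
        # Business value reached through the use cases the dataset serves today.
        value = sum(use_cases[u].annual_value for u in ds.used_by)
        hours_per_year = ds.team_hours_per_month * 12
        if hours_per_year > 0 and value / hours_per_year < low_value_per_hour:
            report["low_value"].append(ds.name)
        # Could improve a use case that it is not yet used for.
        if set(ds.could_serve) - set(ds.used_by):
            report["underutilized"].append(ds.name)
    for uc in use_cases.values():
        if uc.needs_new_data:
            report["underserved"].append(uc.name)
    return report


# Made-up example data.
use_cases = {
    "churn_model": UseCase("churn_model", 500_000),
    "pricing": UseCase("pricing", 2_000_000, needs_new_data=True),
    "ops_dashboard": UseCase("ops_dashboard", 20_000),
}
datasets = [
    Dataset("crm_events", 40, used_by=["churn_model"], could_serve=["pricing"]),
    Dataset("legacy_logs", 5),
    Dataset("ops_metrics", 60, used_by=["ops_dashboard"]),
]

report = build_report(datasets, use_cases)
for finding, names in report.items():
    print(f"{finding}: {names}")
```

The point of the sketch is the join: once each dataset is linked to business use cases, the four findings fall out of simple queries. A real valuation would need better value and effort estimates than these placeholders.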
I wrote up a detailed process with resources that will teach you how to build a solid valuation for your data. It's a subscriber-only post, so sign up today.
If you're a Data Scientist who wants to be a better developer or builder, here's a thread on how to do it. There's so much bad advice out there, and I hope this helps clear things up. 1/8 #DataScience #MachineLearning #Programming
1. Spend a year coding as part of a team. Have people review your code, and review theirs in turn. This will help you unlearn many bad habits. You'll also get exposure to different styles and best practices. 2/8 #DataScience #MachineLearning #Programming
2. Build traditional software engineering projects. Services and web apps are great because you'll learn fundamental coding skills.
Data Science introduces a new model or architecture weekly, and it can be tough to keep up. Here are some of the basics and recent releases with resources to help you quickly understand each one.
1/15 #DataScience #MachineLearning #DeepLearning
Let's start with DALL·E 2. Here's a Python implementation. Sometimes the easiest way to learn about a model is to use it.
Google recently released an overview of PaLM. It's one of a growing list of large-scale language models improving on the capabilities of earlier models like GPT-3. Deep learning is going big.
The Data Science learning path today is different than it was 3 years ago and looks nothing like it did 7 years ago. This thread has the main layers and example resources covering the basics, assuming you've got basic math covered.
1/18 #DataScience #MachineLearning
1. Research Methods. We do a lot of research and experimentation now. Data Scientists used to be model-centric, but that's changed because our work must meet higher reliability requirements. I wrote an intro post: vinvashishta.substack.com/p/a-basic-intr…
2/18 #DataScience #MachineLearning
2. Causal Inference. Data Science has taken a hard turn towards causal inference, again to meet increasing model reliability requirements. An education on CI always starts with Pearl. ftp.cs.ucla.edu/pub/stat_ser/r…
3/18 #DataScience #MachineLearning
Companies need a technology turnaround right now, and that's a huge opportunity for mid-level to senior Data Scientists and leaders.
Playing a key role in a turnaround is a career maker. I've been part of four, and here are some lessons learned.
1/12 #DataScience #MachineLearning #Strategy
Companies only change when they've been through enough pain. That low point and the months immediately after it are where Data Scientists can put forward ideas that will gain traction. Don't push the elephant. Let it lead.
2/12 #DataScience #MachineLearning #Strategy
Companies usually plan without enough information to build a good plan. Data Scientists can help the business understand the possibilities created by our work. That knowledge is critical from the earliest stages.
3/12 #DataScience #MachineLearning #Strategy
The only way Data Science continues to move forward is with an equal focus on research and applications. To generate value, models must perform reliably in production over several years.
Even for mid-level Data Scientists, total compensation starts at $250K+ and goes up to $400K. ML Engineers can cost even more depending on their breadth of platform/architecture knowledge and ability to deploy models at scale.